Project Report On
Process of making a Website more Search Engine Friendly
AT Moksha Business Solutions, Info Tower - 2, Info city, Gandhinagar - 382009
Internal Guide Name:
Mr. DHARMENDRA SUTARIYA Mr. RAHUL SHAH
HANIF KHAN (2152704223) JAYESH SINH (2152704224)
Diploma Semester [5th] Computer Engineering
Month, Year: 2005 – 2006
Submitted To,
Department of Computer Engineering
Shri B.S.Patel Polytechnic
Ganpat Vidhyanagar, Kherva-382711
SHRI B.S.PATEL POLYTECHNIC, GANPAT VIDHYANAGAR, KHERVA
This is to certify that the following students of Diploma Semester [5th] (Computer Engineering) have satisfactorily completed their project work titled “SEO” SEARCH ENGINE OPTIMIZATION in partial fulfillment of the requirements for the award of the Diploma in Engineering. Their successful completion of the work, carried out with great interest and enthusiasm, is recorded at B.S.Patel Polytechnic.
* M.HANIF SIPAI (2152704223) * JAYESH SOLANKI (2152704224)
HOD CE DEPARTMENT
We would like to take this opportunity to thank all those who have helped us and provided direction, technical information and advice at all stages of our project. We are deeply indebted to everyone at our company, Moksha Business, particularly our HR Mr.Hussain Ahemad and project leader Mr.Aashish Khajuria, who have been a consistent source of encouragement and whose in-depth knowledge gave us a full understanding of the project. We sincerely wish to thank all those people who spared their valuable time to guide and help us complete this project. We are also thankful to Mr.Dharmendra Sutariya and Mr.Rahul Shah for their generous support and guidance, which did a lot to shape the tone and scope of the project. Finally, we would like to thank our friends and our family members for their co-operation in making this project a success.
HANIF KHAN JAYESH SINH
SEARCH ENGINE OPTIMIZATION
1. SOFTWARE INTRODUCTION
   SOFTWARE DEFINITION.
   SOFTWARE CHARACTERISTICS.
2. Process Model In Production (SPIRAL MODEL)
   BRIEF HISTORY OF TRANSPORT.
   PROBLEM DOMAIN [CURRENT CONDITION].
   WHY WE USE SPIRAL MODEL?
3. Customer Communication
4. Risk Analysis
5. Planning
   ESTIMATE OF RESOURCES.
   ESTIMATE OF COST.
   ESTIMATE OF SCHEDULES.
6. ENGINEERING
   BACKEND DATABASE MODULE
   FRONT-END MODULE
   Architectural Design (Data Centered architecture)
   DFD (ANALYSIS PART)
   ERD (DESIGN PART)
   DATA DICTIONARY (DESIGN PART)
   Coding
7. Construction & Release
   PARTITIONING
   MODELLING
   TESTING
8. Customer Evolution
   REQUIREMENT SPECIFICATIONS.
   FUTURE EVOLUTION THROUGH SPIRAL MODEL.
   CONCLUSION.
9. Form Layouts
10. Bibliography
“Education is about the only thing lying around loose in the world, and it’s about the only thing a person can have as much as he’s willing to haul away. The tide of opportunity flows one’s way only once. If one swims with tide, he can attain unimaginable heights”.
The word opportunity brings us to MOKSHA BUSINESS, which needs no introduction. Moksha is widely acclaimed as a premier training house, heralding a global phenomenon called e-commerce (business on the Internet). You may wonder about the feasibility of e-commerce now that the boom has happened and there is a lull. We at Moksha know that it is the lull before the storm, a storm that will re-revolutionize e-commerce. B2B (Business to Business) models of e-commerce are being followed and implemented by global powerhouses as the fastest and most far-reaching strategies for increasing interaction and personalizing the relationship with the client. The future holds a lot of promise for e-commerce, which is paving the way for new e-commerce based technologies like m-commerce (Mobile Commerce). The throughput and efficiency of global firms are being vastly improved through e-commerce implementations, thus increasing the productivity of the corporation. E-commerce has opened a new global market with millions of customers and an equal number of products. Speed is the new name of the game, and ideas and innovation are the foremost criteria. The advantage is that an enterprising person does not need to work under someone anymore. You can easily become an entrepreneur on the net, armed with a thorough knowledge of e-commerce concepts, innovative ideas and the will to succeed. Because “Success is not a result of spontaneous combustion. You must set yourself on fire”. Team MOKSHA will teach you how to fuel the fire that burns within you and help you succeed. Personality development helps strengthen the mind and lays the path to a more secure future.
The client retention record at Moksha Business Solutions is as impressive as the list of clients itself. The motto has always been to keep service focused on the goals that the client wants to achieve. Through easy-to-deploy and convenient-to-use call centre solutions, we help businesses grow in concert with customer satisfaction. Presently working for clients in the UK, US and Australia, Moksha Business Solutions is planning to widen its purview of operations. Moksha Business Solutions is one of the premier organizations involved in lead generation for personal finance and mobile companies. Multichannel support across voice, e-mail and Web is provided. Robust solutions delivered on time, on target and within the desired budget are the specialty of Moksha Business Solutions. Understanding that businesses suffer when there is no one to attend to customers' calls, Moksha Business Solutions ensures that there is always a personal and supportive representative to cater to customer queries. Automatic call distribution systems have also been introduced.
Moksha Business Solutions specializes in web design, development, e-commerce solutions and web site maintenance services for all kinds of business requirements. We pride ourselves on economy, quality, accessibility and customer satisfaction.
Business Process Outsourcing (BPO) is one of the fastest growing trends in any industry today. So rapid and so obvious has been its growth that many companies are jumping on the BPO bandwagon every day. There are some that outsource just to save money, and others that take up the work just to make money. Moksha is not about either. MOKSHA has an enterprising and experienced team of experts who take care of all BPO needs. We realize that BPO is useful only if it achieves the dual objective of better management and better finances. In a world that has no place for the average, every company has to deliver on every front. Each company has its weak or non-core areas that it cannot focus on. These are then outsourced to other companies that have expertise in these areas and will handle them more efficiently.
The goal of authorizing Moksha Business Solutions to generate leads is to create business. Moksha Business Solutions accepts this as its prime responsibility. Maintaining an incessant supply of leads makes us a preferred partner for many organizations based in the US, UK, and Australia. The leads generated are of excellent quality. Maximum details of the customer are collected in order to help during the application period. Timely delivery of leads has been a regular feature at Moksha Business Solutions. Clients can mould the lead generation service to suit their own businesses, such as increased business or admin breaks etc. We assure our clients fresh deals that are not internet generated. All leads are verified twice over the phone. This is an accepted practice at Moksha Business Solutions to confirm the genuineness of the leads generated.
Vision of Company
Moksha Business Solutions' vision is to provide exceptional services and impeccable quality in web design, web development, e-commerce solutions and web site maintenance services. We are constantly striving to get better and to expand our niche and areas of expertise.
PREFACE
Search Engine Optimization (SEO) is the process of developing, customizing or retooling a website so that it achieves a sustained high ranking on Search Engine Results Pages (SERP) for important keywords or key phrases. Search Engine Optimization uses a combination of techniques, tools, and technical know-how to get results. Search Engine Submission, Link Popularity Building, Keyword Research Analysis, Search Engine Crawler Friendliness Analysis, Title & Meta Tags Optimization, Anchor Text Optimization and Search Engine Copywriting are just some of the methods utilized to improve a website's search engine placement. These Search Engine Optimization (SEO) methods leverage knowledge gained from a scientific understanding of the inner workings of search engines and a thorough analysis of trends in the constantly evolving field of Search Engine Optimization (SEO).
Benefits of Search Engine Optimization (SEO):-
Search engines generate nearly 90% of Internet traffic and are responsible for 55% of e-commerce transactions. Search Engine Promotion has been shown to deliver the highest ROI compared to any other type of marketing, both online and offline. Search engines bring motivated buyers to you and hence contribute to increased sales conversions.
Search Engine Optimization (SEO) offers an affordable entry point for marketing your website and an effective way to promote your business online. Search Engine Optimization (SEO) also makes for a long-term solution: it gives you access to sustained free traffic and is a means of building brand name and company reputation.
W E B S I T E : -
A website is a collection of web pages, typically common to a particular domain name or sub domain on the World Wide Web on the Internet. A web page is a document, typically written in HTML, that is almost always accessible via HTTP, a protocol that transfers information from the website's server to display in the user's web browser. The pages of a website will be accessed from a common root URL called the homepage, and usually reside on the same physical server. The URLs of the pages organize them into a hierarchy, although the hyperlinks between them control how the reader perceives the overall structure and how the traffic flows between the different parts of the site.
Some websites require a subscription to access some or all of their content. Examples of subscription sites include many business sites, parts of many news sites, gaming sites, message boards, Web-based email services, and sites providing real-time stock market data.
A website may be the work of an individual, a business or other organization and is typically dedicated to some particular topic or purpose. Any website can contain a hyperlink to any other website, so the distinction between individual sites, as perceived by the user, may sometimes be blurred. Websites are written in, or dynamically converted to, HTML (Hyper Text Markup Language) and are accessed using a software program called a Web browser, also known as an HTTP client. Web pages can be viewed or otherwise accessed from a range of computer based and Internet enabled devices of various sizes, including desktop computers, laptop computers, PDAs and cell phones.
A website is hosted on a computer system known as a web server, also called an HTTP server. These terms can also refer to the software that runs on these systems and that retrieves and delivers the Web pages in response to requests from the website's users. Apache is the most commonly used Web server software (according to Netcraft statistics), and Microsoft's Internet Information Server (IIS) is also commonly used.
A static website is one that has content that is not expected to change frequently and is manually maintained by some person or persons using some type of editor software.
A dynamic website is one that has frequently changing information or interacts with the user through various methods (HTTP cookies or database variables such as previous history, session variables, and server-side variables such as environmental data) or through direct interaction (form elements, mouseovers, etc.). When the Web server receives a request for a given page, the page is automatically retrieved from storage by the software in response to the page request. A site can display the current state of a dialogue between users, monitor a changing situation, or provide information in some way personalized to the requirements of the individual user.
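The difference between a static and a dynamic page can be illustrated with a small sketch in Python. The function name `render_page` and the session keys `user_name` and `visits` are hypothetical and chosen only for this example; a real site would rely on its web server or framework to manage sessions.

```python
from html import escape

def render_page(session: dict) -> str:
    """Build a page whose content depends on the visitor's session data."""
    # Escape the name so that user-supplied text cannot inject HTML.
    name = escape(session.get("user_name", "guest"))
    # Update the visit counter kept in the (hypothetical) session store.
    visits = int(session.get("visits", 0)) + 1
    session["visits"] = visits
    return (
        "<html><body>"
        f"<h1>Welcome, {name}</h1>"
        f"<p>Visit number {visits}</p>"
        "</body></html>"
    )

session = {"user_name": "Asha"}
first_page = render_page(session)
second_page = render_page(session)
```

A static page would return the same text on every request, whereas here the second call produces a different page from the first because the session has changed.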
Types of websites:-
There are many varieties of Web sites, each specializing in a particular type of content or use, and they may be arbitrarily classified in any number of ways. A few such classifications might include:
• Affiliate: enabled portal that renders not only its custom CMS but also syndicated content from other content providers for an agreed fee. There are usually three relationship tiers: affiliate agencies (e.g. Commission Junction), advertisers (e.g. eBay) and consumers (e.g. Yahoo).
• Blog (or web log) site: site used to log online readings or to post online diaries, which may include discussion forums (e.g. Blogger, Xanga).
• Corporate website: used to provide background information about a business, organization, or service.
• Commerce site or e-commerce site: for purchasing goods, such as Amazon.com.
• Community site: a site where persons with similar interests communicate with each other, usually by chat or message boards, such as MySpace.
• Database site: a site whose main use is the search and display of a specific database's content, such as the Internet Movie Database or the Political Graveyard.
• Development site: a site whose purpose is to provide information and resources related to software development, Web design and the like.
• Directory site: a site that contains varied contents which are divided into categories and subcategories, such as Yahoo! directory, Google directory and Open Directory Project.
• Download site: strictly used for downloading electronic content, such as software, game demos or computer wallpaper.
• Employment site: allows employers to post job requirements for a position or positions to be filled, using the Internet to advertise worldwide. A prospective employee can locate and fill out a job application or submit a résumé for the advertised position.
• Game site: a site that is itself a game or "playground" where many people come to play, such as MSN Games and Pogo.com.
• Information site: contains content that is intended to inform visitors, but not necessarily for commercial purposes, such as RateMyProfessors.com, Free Internet Lexicon and Encyclopedia. Most government, educational and nonprofit institutions have an informational site.
• Java applet site: contains software to run over the Web as a Web application.
• Mirror site: a complete reproduction of a website.
• News site: similar to an information site, but dedicated to dispensing news and commentary.
• Personal homepage: run by an individual or a small group (such as a family) and containing information or any content that the individual wishes to include.
• Political site: a site on which people may voice political views.
• Review site: a site on which people can post reviews for products or services.
• Search engine site: a site that provides general information and is intended as a gateway or lookup for other sites. A pure example is Google, and the most widely known extended type is Yahoo!.
• Web portal site: a website that provides a starting point, a gateway, or portal, to other resources on the Internet or an intranet.
• Wiki site: a site which users collaboratively edit (such as Wikipedia).
U R L : -
'Uniform Resource Locator' (URL) is a technical, Web-related term used in two distinct meanings:
• In popular usage, it is a widespread synonym for Uniform Resource Identifier (URI); many popular and technical texts use the term "URL" when referring to a URI.
• Strictly, a URL is a URI that, in addition to identifying a resource, describes where the resource is located and how it can be accessed.
The idea of a uniform syntax for global identifiers of network-retrievable documents was the core idea of the World Wide Web. In the early times, these identifiers were variously called "document names", "Web addresses" and "Uniform Resource Locators". These names were misleading, however, because not all identifiers were locators, and even for those that were, this was not their defining characteristic. Nevertheless, by the time RFC 1630 formally defined the term "URI" as a generic term best suited to the concept, the term "URL" had gained widespread popularity, which has continued to this day.
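To make the parts of a URL concrete, the following minimal sketch splits a URL with Python's standard `urllib.parse` module. The helper name `describe_url` and the address `www.example.com` are assumptions made for illustration and do not come from this project.

```python
from urllib.parse import urlsplit

def describe_url(url: str) -> dict:
    """Split a URL into the parts that say how and where to fetch a resource."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,      # access mechanism, e.g. http
        "host": parts.hostname,      # network location of the server
        "port": parts.port,          # None when the URL does not give one
        "path": parts.path,          # position within the site's hierarchy
        "query": parts.query,        # parameters passed to the page
        "fragment": parts.fragment,  # location inside the document
    }

info = describe_url("http://www.example.com:8080/products/list.html?page=2#top")
```

The scheme and host together are what make this identifier a locator: they state the protocol to use and the server to contact.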
I N T E R N E T : -
The Internet is the worldwide, publicly accessible network of interconnected computer networks that transmit data by packet switching using the standard Internet Protocol (IP). It is a "network of networks" that consists of millions of smaller domestic, academic, business, and government networks, which together carry various information and services, such as electronic mail, online chat, file transfer, and the interlinked Web pages and other documents of the World Wide Web.
As of January 11, 2007, 1.093 billion people use the Internet according to Internet World Stats. The most prevalent language for communication on the Internet is English. This may be a result of the Internet's origins, as well as English's role as the lingua franca. It may also be related to the poor capability of early computers to handle characters other than those in the basic Latin alphabet. After English (30% of Web visitors), the most-requested languages on the World Wide Web are Chinese 14%, Japanese 8%, Spanish 8%, German 5%, and French 5% (from Internet World Stats, updated January 11, 2007). By continent, 36% of the world's Internet users are based in Asia, 29% in Europe, and 21% in North America (updated January 11, 2007). The Internet's technologies have developed enough in recent years that good facilities are available for development and communication in most widely used languages.
Internet and the Workplace:-
The Internet is allowing greater flexibility in working hours and location, especially with the spread of unmetered high-speed connections and Web applications.
The Mobile Internet:-
The Internet can now be accessed virtually anywhere by numerous means. Mobile phones, data cards, and cellular routers allow users to connect to the Internet from anywhere there is a cellular network supporting that device's technology.
Common uses of the Internet:-
• E-mail
• File-sharing
• Instant messaging
• Internet fax
• Search engine
• World Wide Web
• Marketing
• Voice Telephony (VoIP)
• Streaming Media
• Collaboration
• Remote Access
W H A T   I S   A   S E A R C H   E N G I N E : -
A search engine is an information retrieval system designed to help find information stored on a computer system, such as on the World Wide Web, inside a corporate or proprietary network, or in a personal computer. The search engine allows one to ask for content meeting specific criteria (typically those containing a given word or phrase) and retrieves a list of items that match those criteria. This list is often sorted with respect to some measure of relevance of the results. Search engines use regularly updated indexes to operate quickly and efficiently.
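The idea of answering queries from an index rather than from the documents themselves can be sketched as follows. This is a minimal illustration and not the design of any real search engine; the names `build_index` and `search` and the sample pages are hypothetical.

```python
from collections import defaultdict

def build_index(documents: dict[str, str]) -> dict[str, set[str]]:
    """Map every word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index: dict[str, set[str]], query: str) -> list[str]:
    """Return the documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)

# Hypothetical pages used only for this example.
pages = {
    "a.html": "cheap flights to london",
    "b.html": "london hotels and flights",
    "c.html": "gardening tips",
}
index = build_index(pages)
results = search(index, "london flights")
```

Because the index is built once and then looked up by word, a query does not need to read every document again, which is why indexes let search engines respond quickly.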
Search engines are one of the primary ways that Internet users find Web sites. That's why a Web site with good search engine listings may see a dramatic increase in traffic. Everyone wants those good listings. Unfortunately, many Web sites appear poorly in search engine rankings or may not be listed at all because they fail to consider how search engines work. In particular, submitting to search engines is only part of the challenge of getting good search engine positioning. It's also important to prepare a Web site through "search engine optimization." Search engine optimization means ensuring that your Web pages are accessible to search engines and are focused in ways that help improve the chances they will be found. This next section provides information, techniques and a good grounding in the basics of search engine optimization. By using this information where appropriate, you may tap into visitors who previously missed your site. The guide is not a primer on ways to trick or "spam" the search engines. In fact, there are not any "search engine secrets" that will guarantee a top listing. But there are a number of small changes you can make to your site that can sometimes produce big results. Let's go forward and first explore the two major ways search engines get their listings; then you will see how search engine optimization can especially help with crawler-based search engines.
There are basically three types of search engines:
The term "search engine" is often used generically to describe both “Crawler-based Search Engines” and “Human-powered Directories”. These two types of search engines gather their listings in radically different ways. The third type, “Hybrid Search Engines”, is a combination of the two.
Crawler-Based Search Engines
Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found. Crawler-based engines send crawlers, or spiders, out into cyberspace. These crawlers visit a Web site, read the information on the actual site, read the site's Meta tags and also follow the links that the site connects to. The crawler returns all that information back to a central depository where the data is indexed. The crawler will periodically return to the sites to check for any information that has changed, and the frequency with which this happens is determined by the administrators of the search engine.
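What a crawler reads from a single page can be sketched with Python's standard `html.parser` module. The sketch below only parses a fixed sample string; a real crawler would also download pages over HTTP and keep a queue of links still to visit. The class name `PageReader`, the sample HTML and the `shop.example` addresses are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageReader(HTMLParser):
    """Collect the Meta tags and outgoing links of one HTML page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.meta = {}    # Meta tag name -> content
        self.links = []   # absolute URLs the page links to

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the address of the page.
            self.links.append(urljoin(self.base_url, attrs["href"]))

SAMPLE = """<html><head><title>Shoes</title>
<meta name="keywords" content="shoes, boots">
<meta name="description" content="A hypothetical shoe shop.">
</head><body>
<a href="/boots.html">Boots</a>
<a href="http://other.example/">Partner</a>
</body></html>"""

reader = PageReader("http://shop.example/index.html")
reader.feed(SAMPLE)
```

After `feed` returns, `reader.meta` holds the page's Meta tags and `reader.links` holds the addresses a crawler would visit next.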
Human-Powered Search Engines
A human-powered directory, such as the Open Directory, depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted. Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.
Human Powered Search Engines rely on humans to submit information that is subsequently indexed and catalogued. Only information that is submitted is put into the index. In both cases, when you query a search engine to locate information, you are actually searching through the index that the search engine has created; you are not actually searching the Web. These indices are giant databases of information that is collected and stored and subsequently searched. This explains why sometimes a search on a commercial search engine, such as Yahoo! or Google, will return results that are in fact dead links. Since the search results are based on the index, if the index hasn't been updated since a Web page became invalid the search engine treats the page as still an active link even though it no longer is. It will remain that way until the index is updated.
Hybrid Search Engines Or Mixed Results
In the web's early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over the other. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it does also present crawler-based results (as provided by Inktomi), especially for more obscure queries.
So why will the same search on different search engines produce different results? Part of the answer is that not all indices are going to be exactly the same. It depends on what the spiders find or what the humans submitted. But more important, not every search engine uses the same algorithm to search through the indices. The algorithm is what the search engines use to determine the relevance of the information in the index to what the user is searching for.
One of the elements that a search engine algorithm scans for is the frequency and location of keywords on a Web page. Those with higher frequency are typically considered more relevant. But search engine technology is becoming sophisticated in its attempt to discourage what is known as keyword stuffing, or Spamdexing. Another common element that algorithms analyze is the way that pages link to other pages in the Web. By analyzing how pages link to each other, an engine can both determine what a page is about (if the keywords of the linked pages are similar to the keywords on the original page) and whether that page is considered "important" and deserving of a boost in ranking. Just as the technology is becoming increasingly sophisticated to ignore keyword stuffing, it is also becoming savvier to Web masters who build artificial links into their sites in order to build an artificial ranking. Without further qualification, search engine usually refers to a Web search engine, which searches for information on the public Web. Other kinds of search engine are enterprise search engines, which search on intranets, personal search engines, and mobile search engines. Different selection and relevance criteria may apply in different environments, or for different uses. Some search engines also mine data available in newsgroups, databases, or open directories. Unlike Web directories, which are maintained by human editors, search engines operate algorithmically or are a mixture of algorithmic and human input.
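The use of keyword frequency and location described above can be sketched with a simple scoring function. The weighting is an assumption made for this example (a keyword in the title counts three times as much as one in the body); real search engines do not publish their weights and use many more signals.

```python
import re

def keyword_score(title: str, body: str, keyword: str,
                  title_weight: float = 3.0) -> float:
    """Score a page by how often a keyword occurs and where it occurs."""
    keyword = keyword.lower()
    title_words = re.findall(r"[a-z0-9]+", title.lower())
    body_words = re.findall(r"[a-z0-9]+", body.lower())
    # Frequency: every occurrence in the body adds one point.
    # Location: every occurrence in the title adds title_weight points.
    return body_words.count(keyword) + title_weight * title_words.count(keyword)

focused = keyword_score("Cheap Flights",
                        "Book cheap flights today. Flights to London.",
                        "flights")
stuffed = keyword_score("Travel", "flights flights flights", "flights")
```

In this sketch the focused page scores higher than the page that merely repeats the word in its body, which shows in a very small way why location matters as well as frequency.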
The first Web search engine was Wandex, a now-defunct index collected by the World Wide Web Wanderer, a web crawler developed by Matthew Gray at MIT in 1993. Another very early search engine, Aliweb, also appeared in 1993, and still runs today. The first "full text" crawler-based search engine was WebCrawler, which came out in 1994. Unlike its predecessors, it let users search for any word in any webpage, which became the standard for all major search engines since. It was also the first one to be widely known by the public. Also in 1994 Lycos came out, and became a major commercial endeavor. Soon after, many search engines appeared and vied for popularity. These included Excite, Infoseek, Inktomi, Northern Light, and AltaVista. In some ways, they competed with popular directories such as Yahoo!. Later, the directories integrated or added on search engine technology for greater functionality.
• Google
• Yahoo! Search
• MSN Search
• Windows Live Search
• Lycos
• AltaVista
• Alltheweb
• WebCrawler
• Northern Light
• Aliweb
G O O G L E : -
Around 2001, the Google search engine rose to prominence. Its success was based in part on the concept of link popularity and PageRank. The number of other websites and web pages that link to a given page is taken into consideration with PageRank, on the premise that good or desirable pages are linked to more than others. The PageRank of linking pages and the number of links on these pages contribute to the PageRank of the linked page. This makes it possible for Google to order its results by how many websites link to each found page. Google's minimalist user interface is very popular with users, and has since spawned a number of imitators.
Google and most other web engines utilize not only PageRank but more than 150 criteria to determine relevancy. The algorithm "remembers" where it has been and indexes the number of cross-links and relates these into groupings. In this way virtual communities of web pages are found. PageRank is based on citation analysis that was developed in the 1950s by Eugene Garfield at the University of Pennsylvania. Google's founders cite Garfield's work in their original paper. Teoma’s search technology uses a communities approach in its ranking algorithm. NEC Research Institute has worked on similar technology. Web link analysis was first developed by Jon Kleinberg and his team while working on the CLEVER project at IBM's Almaden Research Center. Google is currently the most popular search engine.
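The way link popularity feeds into a page's rank can be sketched with a simplified PageRank calculation. This is a textbook-style illustration and not Google's production algorithm. The damping factor of 0.85, the number of iterations and the four-page site are assumptions made for this example, and every linked page is assumed to appear as a key in `links`.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Estimate the rank of each page from the links between pages."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small base share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page divides its rank equally among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page without links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical site: three pages link to "home", nothing links to "blog".
site = {
    "home": ["about", "products"],
    "about": ["home"],
    "products": ["home"],
    "blog": ["home"],
}
ranks = pagerank(site)
```

In this example "home" ends up with the highest rank because the most pages link to it, while "blog" ends up lowest because no page links to it.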
Benefits of a Google Search
1. Your search covers billions of URLs.
Google's index, comprised of billions of URLs, is the first of its kind and represents the most comprehensive collection of the most useful web pages on the Internet. While index size alone is not the key determinant of quality results, it has an obvious effect on the likelihood of a relevant result being returned.
2. You'll see only pages that are relevant to the terms you type.
Google only produces results that match all of your search terms or, through use of a proprietary technology, results that match very close variations of the words you've entered (e.g., if you enter "comic book", we may return results for "comic books" as well). The search terms or their variants must appear in the text of the page or in the text of the links pointing to the page. This spares you the frustration of viewing a multitude of results that have nothing to do with what you're looking to find.
3. The position of your search terms is treated with respect.
Google analyzes the proximity of your search terms within the page. Google prioritizes results according to how closely your individual search terms appear and favors results that have your search terms near each other. Because of this, the result is much more likely to be relevant to your query.
4. You see what you're getting before you click.
Instead of web page summaries that never change, Google shows an excerpt (or "snippet") of the text that matches your query -- with your search terms in boldface -- right in the search results. This sneak preview gives you a good idea if a page is going to be relevant before you visit it.
5. You can feel lucky and save time doing it.
Google excels at producing extremely relevant results, and flat out nails many queries such as company names. We're so confident, in fact, that we've installed an "I'm Feeling Lucky" button, which takes you directly to the site of the highest ranked result in your search. Try it and let us know if our confidence is justified.
6. You can get it, even when it's gone.
As Google crawls the web, it takes a snapshot of each page and analyzes it to determine the page's relevance. You can access these cached pages if the original page is temporarily unavailable due to Internet congestion or server problems. Though the information on cached pages is frequently not the most recent version of a site, it usually contains useful information. Plus, your search terms will be highlighted in color on the cached page, making it easy to find the section of the page relevant to your query.
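Point 3 above says that results are favored when the search terms appear near each other. One simple way to measure this, shown here only as an illustrative sketch and not as Google's actual method, is the smallest number of word positions between two terms. The function name `min_term_distance` is hypothetical.

```python
import re

def min_term_distance(text: str, term_a: str, term_b: str):
    """Return the smallest gap, in words, between two terms, or None."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    positions_a = [i for i, w in enumerate(words) if w == term_a.lower()]
    positions_b = [i for i, w in enumerate(words) if w == term_b.lower()]
    if not positions_a or not positions_b:
        return None  # at least one of the terms is missing from the text
    return min(abs(a - b) for a in positions_a for b in positions_b)

close = min_term_distance("Visit our comic book shop", "comic", "book")
far = min_term_distance("A comic about a very old book", "comic", "book")
```

A page where the distance is small, as in the first sentence, would be treated as a better match for the query "comic book" than a page where the two words are far apart.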
Y A H O O : -
The two founders of Yahoo!, David Filo and Jerry Yang, Ph.D. candidates in Electrical Engineering at Stanford University, started their guide in a campus trailer in February 1994 as a way to keep track of their personal interests on the Internet. Before long they were spending more time on their home-brewed lists of favorite links than on their doctoral dissertations. Eventually, Jerry and David's lists became too long and unwieldy, and they broke them out into categories. When the categories became too full, they developed subcategories ... and the core concept behind Yahoo! was born.
In 2002, Yahoo! acquired Inktomi, and in 2003 it acquired Overture, which owned Alltheweb and AltaVista. Despite owning its own search engine, Yahoo! initially kept using Google to provide its users with search results on its main website Yahoo.com. However, in 2004, Yahoo! launched its own search engine based on the combined technologies of its acquisitions, providing a service that gave pre-eminence to the Web search engine over the directory.
M I C R O S O F T   S E A R C H : -
The most recent major search engine is MSN Search, owned by Microsoft, which previously relied on others for its search engine listings. In 2004 it debuted a beta version of its own results, powered by its own web crawler (called msnbot). In early 2005 it started showing its own results live. This was barely noticed by average users, who are unaware of where results come from, but was a huge development for many webmasters, who seek inclusion in the major search engines. At the same time, Microsoft ceased using results from Inktomi, now owned by Yahoo!. In 2006, Microsoft migrated to a new search platform, Windows Live Search, retiring the "MSN Search" name in the process.
Challenges faced by Search Engines:-
• The Web is growing much faster than any present-technology search engine can possibly index (see distributed web crawling). In 2006, some users found that major search engines had become slower to index new web pages.
• Many web pages are updated frequently, which forces the search engine to revisit them periodically.
• The queries one can make are currently limited to searching for keywords, which may result in many false positives, especially with the default whole-page search. Better results might be achieved by using a proximity-search option with a search bracket to limit matches to within a paragraph or phrase, rather than matching random words scattered across large pages. Another alternative is to use human operators to do the researching for the user with organic search engines.
• Dynamically generated sites may be slow or difficult to index, or may produce excessive results, perhaps generating 500 times more web pages than average. Example: for a dynamic web page which changes content based on entries inserted from a database, a search engine might be requested to index 50,000 static web pages for 50,000 different parameter values passed to that dynamic web page.
• Many dynamically generated websites are not indexable by search engines; this phenomenon is known as the invisible web. There are search engines that specialize in crawling the invisible web by crawling sites that have dynamic content, require forms to be filled out, or are password protected.
• Relevancy: sometimes the engine cannot find what the person is looking for.
• Some search engines do not rank results by relevance, but by the amount of money the matching websites pay.
• In 2006, hundreds of generated websites used tricks to manipulate search engines into displaying them in the higher results for numerous keywords. This can lead to some search results being polluted with link spam or bait-and-switch pages which contain little or no information about the matching phrases. The more relevant web pages are pushed further down in the results list, perhaps by 500 entries or more.
• Secure pages (content hosted on HTTPS URLs) pose a challenge for crawlers, which either cannot browse the content for technical reasons or will not index it for privacy reasons.
HOW THE SEARCH ENGINE WORKS
A search engine operates in the following order:
1. Web crawling
2. Indexing
3. Searching
Web search engines work by storing information about a large number of web pages, which they retrieve from the WWW itself. These pages are retrieved by a Web crawler (sometimes also known as a spider) — an automated Web browser which follows every link it sees. Exclusions can be made by the use of robots.txt. The contents of each page are then analyzed to determine how it should be indexed (for example, words are extracted from the titles, headings, or special fields called Meta tags). Data about web pages are stored in an index database for use in later queries. Some search engines, such as Google, store all or part of the source page (referred to as a cache) as well as information about the web pages, whereas others, such as AltaVista, store every word of every page they find. This cached page always holds the actual search text since it is the one that was actually indexed, so it can be very useful when the content of the current page has been updated and the search terms are no longer in it. This problem might be considered to be a mild form of link rot, and Google's handling of it increases usability by satisfying user expectations that the search terms will be on the returned webpage. This satisfies the principle of least astonishment since the user normally expects the search terms to be on the returned pages. Increased search relevance makes these cached pages very useful, even beyond the fact that they may contain data that may no longer be available elsewhere.
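The indexing step described above can be illustrated with a minimal sketch. It assumes that only the title, the h1 to h3 headings and the meta description are indexed, and the two sample pages and their URLs are made up for illustration; a real search engine indexes far more than this and does not publish its method.

```python
from collections import defaultdict
from html.parser import HTMLParser
import re


class FieldExtractor(HTMLParser):
    """Collects words from <title>, <h1>-<h3> and the meta description."""

    FIELDS = {"title", "h1", "h2", "h3"}

    def __init__(self):
        super().__init__()
        self.current = None   # field currently being read, if any
        self.words = []       # (field, word) pairs found so far

    def handle_starttag(self, tag, attrs):
        if tag in self.FIELDS:
            self.current = tag
        elif tag == "meta":
            attrs = dict(attrs)
            if (attrs.get("name") or "").lower() == "description":
                self._add("meta", attrs.get("content") or "")

    def handle_endtag(self, tag):
        if tag == self.current:
            self.current = None

    def handle_data(self, data):
        if self.current:
            self._add(self.current, data)

    def _add(self, field, text):
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            self.words.append((field, word))


def build_index(pages):
    """Map each word to the set of (url, field) places where it occurs."""
    index = defaultdict(set)
    for url, html in pages.items():
        parser = FieldExtractor()
        parser.feed(html)
        parser.close()
        for field, word in parser.words:
            index[word].add((url, field))
    return index


# Hypothetical pages, used only to exercise the sketch.
pages = {
    "http://example.com/a": "<html><head><title>Cheap Flights</title>"
                            "<meta name='description' content='Book flights online'>"
                            "</head><body><h1>Flights to Goa</h1></body></html>",
    "http://example.com/b": "<html><head><title>Hotel Deals</title></head>"
                            "<body><h2>Hotels in Goa</h2></body></html>",
}
index = build_index(pages)
print(sorted(index["goa"]))
```

Looking up a word in `index` is then a dictionary access, which is why queries can be answered without visiting the Web again.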
When a user comes to the search engine and makes a query, typically by giving keywords, the engine looks up the index and provides a listing of best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes parts of the text. Most search engines support the use of the Boolean terms AND, OR and NOT to further specify the search query. An advanced feature is proximity search, which allows users to define the distance between keywords. The usefulness of a search engine depends on the relevance of the result set it gives back. While there may be millions of web pages that include a particular word or phrase, some pages may be more relevant, popular, or authoritative than others. Most search engines employ methods to rank the results to provide the "best" results first. How a search engine decides which pages are the best matches, and what order the results should be shown in, varies widely from one engine to another. The methods also change over time as Internet usage changes and new techniques evolve.
Most Web search engines are commercial ventures supported by advertising revenue and, as a result, some employ the controversial practice of allowing advertisers to pay money to have their listings ranked higher in search results. Those search engines which do not accept money for their search engine results make money by running search-related ads alongside the regular search engine results. The search engines make money every time someone clicks on one of these ads. The vast majority of search engines are run by private companies using proprietary algorithms and closed databases, though some are open source.
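The Boolean terms AND, OR and NOT mentioned earlier in this section can be sketched as set operations on a small index. This is only an illustration under simple assumptions: the three documents are invented, and the function and parameter names (`search`, `must`, `any_of`, `must_not`) are hypothetical, not any engine's real query interface.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, all_docs, must=(), any_of=(), must_not=()):
    """AND over `must`, OR over `any_of`, NOT over `must_not`."""
    result = set(all_docs)
    for word in must:                       # AND: keep docs containing every word
        result &= index.get(word, set())
    if any_of:                              # OR: keep docs containing at least one
        either = set()
        for word in any_of:
            either |= index.get(word, set())
        result &= either
    for word in must_not:                   # NOT: drop docs containing the word
        result -= index.get(word, set())
    return sorted(result)


# Made-up documents for illustration.
docs = {
    1: "cheap flights to Goa",
    2: "cheap hotels in Goa",
    3: "luxury hotels in Mumbai",
}
index = build_index(docs)

print(search(index, docs, must=["cheap", "goa"]))               # [1, 2]
print(search(index, docs, must=["hotels"], must_not=["cheap"]))  # [3]
print(search(index, docs, any_of=["flights", "mumbai"]))         # [1, 3]
```

Ranking the matching documents is a separate step and is not attempted here.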
Major Search Engines: The Same, but Different:-
All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results. Some of the significant differences between the major crawler-based search engines are summarized on the Search Engine Features Page. Information on this page has been drawn from the help pages of each search engine, along with knowledge gained from articles, reviews, books, independent research, tips from others and additional information received directly from the various search engines.
WHAT IS A CRAWLER
A web crawler (also known as a Web spider or Web robot) is a program or automated script which browses the World Wide Web in a methodical, automated manner; this process is called Web crawling or spidering. Other less frequently used names for Web crawlers are ants, automatic indexers, bots, and worms (Kobayashi and Takeda, 2000). Many legitimate sites, in particular search engines, use spidering as a means of providing up-to-date data. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine, which will index the downloaded pages to provide fast searches. Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code. Also, crawlers can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam). A Web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies.
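The seed list and crawl frontier described above can be sketched in a few lines. This is a minimal illustration, not a real crawler: the "Web" is a small made-up dictionary of links so that nothing is downloaded, and breadth-first order is an assumed policy chosen for simplicity.

```python
from collections import deque


def crawl(seeds, fetch_links, max_pages=100):
    """Visit pages starting from `seeds`; `fetch_links(url)` returns the links on a page."""
    frontier = deque(seeds)   # URLs waiting to be visited (the crawl frontier)
    seen = set(seeds)         # URLs already queued, so none is visited twice
    visited = []              # order in which pages were "downloaded"
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()          # FIFO queue gives breadth-first order
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Hypothetical link graph standing in for real pages.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["d"],
    "d": [],
}

order = crawl(["a"], lambda url: TOY_WEB.get(url, []))
print(order)  # ['a', 'b', 'c', 'd']
```

In a real crawler `fetch_links` would download the page and extract its hyperlinks, and the `max_pages` limit stands in for the prioritization a crawler needs because it can only download a fraction of the Web.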
There are three important characteristics of the Web that make Web crawling very difficult: its large volume, its fast rate of change, and dynamic page generation, which together produce a wide variety of possible crawlable URLs. The large volume implies that the crawler can only download a fraction of the Web pages within a given time, so it needs to prioritize its downloads. The high rate of change implies that by the time the crawler is downloading the last pages from a site, it is very likely that new pages have been added to the site, or that pages have already been updated or even deleted. The recent increase in the number of pages being generated by server-side scripting languages has also created difficulty, in that endless combinations of HTTP GET parameters exist, only a small selection of which will actually return unique content. For example, a simple online photo gallery may offer four options to users, as specified through HTTP GET parameters. If there exist four ways to sort images, three choices of thumbnail size, two file formats, and an option to disable user-provided contents, then that same set of content can be accessed with forty-eight different URLs (4 x 3 x 2 x 2 = 48), all of which will be present on the site. This mathematical combination creates a problem for crawlers, as they must sort through endless combinations of relatively minor scripted changes in order to retrieve unique content.
As Edwards et al. noted, "Given that the bandwidth for conducting crawls is neither infinite nor free it is becoming essential to crawl the Web in not only a scalable, but efficient way, if some reasonable measure of quality or freshness is to be maintained." A crawler must carefully choose at each step which pages to visit next. The behavior of a Web crawler is the outcome of a combination of policies:
• A selection policy that states which pages to download.
• A re-visit policy that states when to check for changes to the pages.
• A politeness policy that states how to avoid overloading websites.
• A parallelization policy that states how to coordinate distributed web crawlers.
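The photo gallery example given earlier in this section can be checked by enumerating every combination of GET parameters. The parameter names and values below (`sort`, `thumb`, `format`, `user_content`) and the base URL are assumptions made up for the sketch; only the counts (four, three, two and two choices) come from the text.

```python
from itertools import product
from urllib.parse import urlencode

# Hypothetical parameter names; the number of choices per option follows the text.
OPTIONS = {
    "sort": ["name", "date", "size", "rating"],   # four ways to sort images
    "thumb": ["small", "medium", "large"],         # three thumbnail sizes
    "format": ["jpg", "png"],                      # two file formats
    "user_content": ["on", "off"],                 # user-provided content on or off
}


def gallery_urls(base="http://example.com/gallery"):
    """Build one URL for every combination of option values."""
    keys = list(OPTIONS)
    return [
        base + "?" + urlencode(dict(zip(keys, combo)))
        for combo in product(*(OPTIONS[key] for key in keys))
    ]


urls = gallery_urls()
print(len(urls))   # 48
print(urls[0])     # http://example.com/gallery?sort=name&thumb=small&format=jpg&user_content=on
```

All 48 URLs lead to the same set of images, which is why a selection policy has to decide how many of them are worth downloading.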
Crawler identification:-
Web crawlers typically identify themselves to a Web server by using the User-agent field of an HTTP request. Web site administrators typically examine their web servers' logs and use the user agent field to determine which crawlers have visited the Web server and how often. The user agent field may include a URL where the Web site administrator may find out more information about the crawler. Spam bots and other malicious Web crawlers are unlikely to place identifying information in the user agent field, or they may mask their identity as a browser or other well-known crawler. It is important for Web crawlers to identify themselves so that Web site administrators can contact the owner if needed. In some cases, crawlers may be accidentally trapped in a crawler trap, or they may be overloading a Web server with requests, and the owner needs to stop the crawler. Identification is also useful for administrators who are interested in knowing when they may expect their Web pages to be indexed by a particular search engine.
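As a rough sketch of how an administrator might use the user agent field, the following counts crawler visits in a server log. It assumes the log uses the common "combined" format, in which the user agent is the last quoted field on each line; the sample log lines and the short list of crawler names are illustrative only.

```python
import re
from collections import Counter

# Assumption: the user agent is the last double-quoted field on the line.
USER_AGENT = re.compile(r'"([^"]*)"\s*$')

# Substrings assumed to mark well-known crawlers (illustrative, not complete).
KNOWN_CRAWLERS = ["Googlebot", "msnbot", "Slurp"]


def count_crawler_visits(log_lines):
    """Count requests per known crawler, based on the user agent field."""
    visits = Counter()
    for line in log_lines:
        match = USER_AGENT.search(line)
        if not match:
            continue
        agent = match.group(1).lower()
        for name in KNOWN_CRAWLERS:
            if name.lower() in agent:
                visits[name] += 1
                break
    return visits


# Made-up log lines for illustration.
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Jan/2006:10:00:00 +0530] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '65.54.188.1 - - [10/Jan/2006:10:05:00 +0530] "GET /about.html HTTP/1.1" 200 2048 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"',
    '192.0.2.10 - - [10/Jan/2006:10:06:00 +0530] "GET / HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
    '66.249.66.1 - - [10/Jan/2006:11:00:00 +0530] "GET /seo.html HTTP/1.1" 200 4096 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
]

visits = count_crawler_visits(SAMPLE_LOG)
print(dict(visits))
```

Because a malicious crawler can send any user agent it likes, a count like this shows only what each visitor claims to be.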
Examples of Web crawlers:-
The following is a list of published crawler architectures for general-purpose crawlers (excluding focused Web crawlers), with a brief description that includes the names given to the different components and outstanding features:
RBSE :- (Eichmann, 1994) was the first published web crawler. It was based on two programs: the first program, "spider", maintains a queue in a relational database, and the second program, "mite", is a modified www ASCII browser that downloads the pages from the Web.
WebCrawler :- (Pinkerton, 1994) was used to build the first publicly available full-text index of a subset of the Web. It was based on lib-WWW to download pages, and another program to parse and order URLs for breadth-first exploration of the Web graph. It also included a real-time crawler that followed links based on the similarity of the anchor text with the provided query.
World Wide Web Worm :- (McBryan, 1994) was a crawler used to build a simple index of document titles and URLs. The index could be searched by using the grep UNIX command.
Google Crawler :- (Brin and Page, 1998) is described in some detail, but the reference is only about an early version of its architecture, which was based in C++ and Python. The crawler was integrated with the indexing process, because text parsing was done for full-text indexing and also for URL extraction. There is a URL server that sends lists of URLs to be fetched by several crawling processes. During parsing, the URLs found were passed to a URL server that checked whether the URL had been previously seen. If not, the URL was added to the queue of the URL server.
Cobweb :- (da Silva et al., 1999) uses a central "scheduler" and a series of distributed "collectors". The collectors parse the downloaded Web pages and send the discovered URLs to the scheduler, which in turn assigns them to the collectors. The scheduler enforces a breadth-first search order with a politeness policy to avoid overloading Web servers. The crawler is written in Perl.
Mercator :- (Heydon and Najork, 1999) is a modular web crawler written in Java. Its modularity arises from the usage of interchangeable "protocol modules" and "processing modules". Protocol modules are related to how to acquire the Web pages (e.g. by HTTP), and processing modules are related to how to process Web pages. The standard processing module just parses the pages and extracts new URLs, but other processing modules can be used to index the text of the pages, or to gather statistics from the Web.
WebFountain :- (Edwards et al., 2001) is a distributed, modular crawler similar to Mercator but written in C++. It features a "controller" machine that coordinates a series of "ant" machines. After repeatedly downloading pages, a change rate is inferred for each page and a non-linear programming method must be used to solve the equation system for maximizing freshness. The authors recommend using this crawling order in the early stages of the crawl, and then switching to a uniform crawling order, in which all pages are being visited with the same frequency.
PolyBot :- (Shkapenyuk and Suel, 2002) is a distributed crawler written in C++ and Python, which is composed of a "crawl manager", one or more "downloaders" and one or more "DNS resolvers". Collected URLs are added to a queue on disk, and processed later to search for seen URLs in batch mode. The politeness policy considers both third and second level domains (e.g. www.example.com and www2.example.com are third level domains) because third level domains are usually hosted by the same Web server.
WebRACE :- (Zeinalipour-Yazti and Dikaiakos, 2002) is a crawling and caching module implemented in Java, and used as a part of a more generic system called eRACE. The system receives requests from users for downloading Web pages, so the crawler acts in part as a smart proxy server. The system also handles requests for "subscriptions" to Web pages that must be monitored: when the pages change, they must be downloaded by the crawler and the subscriber must be notified. The most outstanding feature of WebRACE is that, while most crawlers start with a set of "seed" URLs, WebRACE is continuously receiving new starting URLs to crawl from.
FAST Crawler :- (Risvik and Michelsen, 2002) is the crawler used by the FAST search engine, and a general description of its architecture is available. It is a distributed architecture in which each machine holds a "document scheduler" that maintains a queue of documents to be downloaded by a "document processor" that stores them in a local storage subsystem. Each crawler communicates with the other crawlers via a "distributor" module that exchanges hyperlink information.
Labrador :- is a closed-source web crawler that works with the open source Terrier search engine project.
In addition to the specific crawler architectures listed above, there are general crawler architectures published by Cho (Cho and Garcia-Molina, 2002) and Chakrabarti (Chakrabarti, 2003).
Web crawler architectures:-
High-level architecture of a standard Web crawler
The Role of Web Search Engines:-
Search engines do not always provide the right information, but rather often subject the user to a deluge of disjointed, irrelevant data. Search engines do not manage information, at least not in the conventional business sense. They do not, in fact, search the Internet when the search button is clicked. Their crawlers and spiders have done their work in advance, based on their own criteria and the categories they want to include in their database. Conducting successful searches is dependent on knowing how the engines work. This knowledge also helps to get a website noticed by the search engines.
Search engines provide Internet users a tool for locating and retrieving data from the World Wide Web, based upon keywords supplied by the user. All search engines share the following basic elements:
1) A spider (also referred to as a crawler or a bot) that goes onto the web and reads pages, following hypertext links to other pages and sites on the web.
2) A program that configures the pages that have been read by the spider into an index.
3) A second program that takes user-supplied keywords and searches the index, essentially a process of comparison and matching based on the engine's criteria, returning the user a set of results. The results are usually ranked according to how closely they match the keywords, as defined by the search engine's set of variable criteria.
Behind the Scenes:-
An important point regarding search engines is that the user's search is a search of the engine's index and not of the web itself. The web search, performed by the spider, occurs earlier. The amount of the web crawled by the spider determines the size of the engine's library of web documents. No search engine is able to cover the entire web. Pages appear and disappear hourly.
The size of the web is mammoth, and it continues to grow at an exponential rate. Including 100% of the web is not possible or even desirable, as the quality of web-based documents varies from junk to highly respected, reliable, and relevant data. Search engines differ as to what percentage of the web they cover, their techniques used in obtaining coverage, and their selectivity in eliminating junk.
How frequently the engine's spider crawls the web will also influence the user's choice of search tool. Frequent crawling ensures that current documents are included in the index, and documents no longer available (dead links) are eliminated. This is of significant importance in an age where old information is quickly superseded by the new.
All search engines support single-word queries. The user simply types in a keyword and presses the search button. Most engines also support multiple-word queries. However, the engines differ as to whether and to what extent they support Boolean operators (such as "and" and "or") and the level of detail supported in the query. More specific queries will enhance the relevance of the user's results.
The final step is the search, locate, and match process itself. Location and frequency of the keywords' occurrence within the document are the most common criteria used by search engines in matching and ranking. Words located in the title of a document are awarded a greater weight, as are words located in HTML Meta tags. Those located in subject descriptions and those located higher up (i.e., earlier) on the page are also more highly weighted. The frequent recurrence of the keywords results in a greater weight; however, frequency is subject to certain limitations.
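The location and frequency criteria described above can be sketched as a simple scoring function. Every number in it is an assumption chosen for illustration (the weights, the size of the "early" window and the cap on counted repetitions); search engines keep their actual formulas private. The two sample pages are made up.

```python
import re

# Illustrative weights only: title counts most, then meta tags, then body text.
WEIGHTS = {"title": 5.0, "meta": 3.0, "body": 1.0}
EARLY_BONUS = 1.0    # extra credit if the keyword appears near the top of the body
EARLY_WORDS = 50     # "near the top" is assumed to mean the first 50 body words
MAX_COUNTED = 7      # assumed cap, modelling "frequency is subject to limitations"


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, keyword):
    """Weighted keyword count over the page fields, plus a bonus for early use."""
    keyword = keyword.lower()
    total = 0.0
    for field, weight in WEIGHTS.items():
        count = tokenize(page.get(field, "")).count(keyword)
        total += weight * min(count, MAX_COUNTED)
    if keyword in tokenize(page.get("body", ""))[:EARLY_WORDS]:
        total += EARLY_BONUS
    return total


def rank(pages, keyword):
    """Return page names ordered from best to worst match."""
    return sorted(pages, key=lambda name: score(pages[name], keyword), reverse=True)


# Hypothetical pages for illustration.
pages = {
    "a": {"title": "Goa hotels", "meta": "Hotels in Goa", "body": "Find hotels near the beach."},
    "b": {"title": "Travel blog", "meta": "", "body": "We stayed in two hotels."},
}

print(score(pages["a"], "hotels"))  # 10.0
print(score(pages["b"], "hotels"))  # 2.0
print(rank(pages, "hotels"))        # ['a', 'b']
```

The cap means that repeating a keyword beyond a point earns nothing more, which is one simple way a frequency limit could work.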
Most search engines recognize and defend against a practice called spamdexing. "Spamdexing" refers to the practice of tailoring the design and content of a web document in an effort to cause search engines to index it favorably. The actual content may not be relevant. The most common practice is to simply overload the web page with common terms in the initial part of the document, in a way that is invisible to the user but readable by the search engine spider and index program. The detection of spamdexing by search engines will cause most to either omit the site from their results or to rank it at the bottom.
Variations on the Search Engine:-
A search engine is not the same as a "subject directory." A subject directory does not visit the web, at least not by using the programmed, automated tools of a search engine. Websites must be submitted to a staff of trained individuals entrusted with the task of reviewing, classifying, and ranking the sites. Content has been screened for quality, and the sites have been categorized and organized so as to provide the user with the most logical access. The advantage of subject directories is that they typically yield a smaller, but more focused, set of results. The most significant limitation of a subject directory is the time lag involved in reviewing and categorizing sites. Yahoo!, the original subject directory and the most often consulted web search tool, has been criticized for this. A subject directory is also only as good as the classification scheme used by the directory's managers.
Gaining in popularity is the combining of a subject directory with a search engine to form a portal. A portal is intended to be an Internet user's one-stop point of entry to the web. Portals often provide the user with a number of specialized search engines (e.g., focusing only on news or sports), sometimes tied in to other sites. The appeal lies in the array of customizable and personalized features they can offer. For example, portals frequently offer a periodic status report on a stock portfolio, free web-based e-mail, and a customizable home page with a menu of favorite destinations. Portals often store and track a user's personal data, favorite topics, and frequent searches to provide personalized services (e.g., horoscopes, weather reports, or breaking news). The array of services will continue to expand as portals compete to build and keep audiences that can be sold to advertisers.
The Future of Search Engines:-
Not all search engines are going to be successful. To date, Yahoo! is the only one to turn a profit. If a sustainable business model is not eventually found, these companies will fail. The current strategy is to fold a search engine into a larger portal site. An increasing number of personalized services (e.g., paging services, weather reports, and chat rooms) are being added by the portals to increase the likelihood of the user logging on and staying put. Portals are becoming one-stop web organizers. America Online and Yahoo! currently dominate the portal race. However, Lycos's recent acquisition of HotBot, Disney's acquisition of Infoseek and newcomer Northern Light are examples of different solutions to the positioning question.
The Role of Search Engine Rank in Driving Traffic to Your Website:
Having a desirable search engine rank is ideal for driving traffic to your website. Generally, the majority of a website's traffic comes through internet users' use of the search engines. A good search engine rank is really important considering that over 80% of traffic for most websites is directed via search engines, and most users of search engines only click through to websites that have a search engine rank within the first three pages of the search engine results.
There are a number of search engines today. Each search engine has its own algorithms, which are rules that determine how websites are placed in its search engine rank. Thus, a search engine optimization strategy that provides a desirable search engine rank in one engine may not produce good results in another search engine. So, when trying to achieve a beneficial search engine rank, you really need to focus on one, or maybe two, search engines. If your website gets a good search engine rank in the other search engines, you can consider that a blessing.
Since your optimization strategy does need to be focused to be successful, you may wonder which search engine or engines you should focus on to achieve a search engine rank within the first three pages of results. I recommend focusing your search engine rank attempts on Google first and Yahoo! second. Getting a good search engine rank with Google will undoubtedly drive loads of traffic to your website. Statistics have repeatedly shown that Google is the most used search engine, with Yahoo! coming in second. These two major search engines also power some of the smaller search engines, meaning that the smaller search engines draw their results from the major ones. So, if you get a good search engine rank in Google or Yahoo!, you will likely get a decent search engine rank in some of the search engines they power, such as MSN, AOL, Ask Jeeves, and AltaVista.
Another reason to strive for a top search engine rank in Google is that it is an organic search engine: the results it displays are based on the quality and relevance of the information on a website to the key terms searchers use, rather than on paid advertising and who can pay the most to reach the top. When you do a search using Google, you will notice that sponsored links that appear at the top and at the right of the screen are clearly marked so the searcher knows they are paid ads. All other results are generated based on algorithms, not paid advertisements. With Yahoo!, you can also tell which results are generated as a result of sponsored links.
If you cannot achieve a search engine rank in the first three pages of Google or Yahoo!, you can always opt for paid advertising that will get you listed in the sponsored links. Google's pay-per-click advertising program is called AdWords. Yahoo!'s cost-per-click advertising program is offered through Yahoo! Search Marketing (formerly Overture). With both of these programs, you can get your website advertised as sponsored links in search engine results even if you cannot organically achieve a search engine rank. Both programs have budgeting features which enable you to control the amount of money spent through your pay-per-click/cost-per-click advertising campaign. The difference between organic search engine rank listings and pay-per-click/cost-per-click listings is that organic listings don't cost you anything, while with per-click listings you are charged the amount of your keyword bid for every click-through to your website. Whether you get a search engine rank in the search results naturally or through paid advertising, the search engine rank is vital for driving traffic to your website.
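The per-click charging and budgeting described above come down to simple arithmetic, sketched below. It assumes a fixed bid charged for every click and a daily budget after which the listing stops being shown; the function names and figures are made up, and amounts are whole numbers in the smallest currency unit (for example paise or cents) to avoid rounding problems.

```python
def clicks_within_budget(bid_per_click, daily_budget):
    """Number of whole clicks a daily budget can pay for at a fixed bid."""
    if bid_per_click <= 0:
        raise ValueError("bid must be positive")
    return daily_budget // bid_per_click


def campaign_cost(bid_per_click, clicks, daily_budget):
    """Cost of `clicks` click-throughs, never exceeding the daily budget."""
    paid_clicks = min(clicks, clicks_within_budget(bid_per_click, daily_budget))
    return paid_clicks * bid_per_click


# Example: a bid of 25 per click with a daily budget of 1000.
print(clicks_within_budget(25, 1000))   # 40
print(campaign_cost(25, 30, 1000))      # 750
print(campaign_cost(25, 100, 1000))     # 1000
```

An organic listing, by contrast, would cost nothing per click however many visitors it brings.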
Building a website is just the beginning. Most websites fail for lack of traffic. In order to get traffic, the most cost-effective step you can take is to prepare your website properly so that it can come up in the first few pages of results when someone searches for your most important keywords or keyword phrases. The preparation of web pages so that they are search engine friendly is known as Search Engine Optimization or SEO.
Basic Benefits of SEO:-
• It has a positive psychological impact on a visitor.
• Helps you create a brand identity. Even the ‘brand recall’ would be much higher.
• Increase in targeted on-line traffic.
• Better web site positioning.
• Dominate in competition with your mirror sites.
• Fast measurable ROI.
• Increased and boosted product sales and online visibility.
• Lower client acquisition costs.
• Broaden web-marketing share.
• Compete efficiently against larger competitors.
• Continuous visibility.
• Make the most out of the best tool for advertising.
• The cheapest marketing tool, even on the net.
Search Engine Optimization can be divided into two steps:
• On-Page Optimization
• Off-Page Optimization
On-Page Search Engine Optimization:
The on-page optimization elements are:
• Titles, Headings
• Meta tags
• Clean Design
• Navigation
• Content - Keywords
• URLs
• Sitemaps
• File size
• Site size
• Domain name
• Site age
• Images - ALT
• Outgoing links
• And others…
On-page optimization is the process by which various elements on an individual web page are structured so that the web page can be found by the search engines for specific keywords or keyword phrases. On-page optimization alone will not guarantee a top rating within a search engine; a high ranking also depends on off-page optimization. However, off-page optimization is far more effective when on-page optimization is in place. On-page optimization is not difficult. It does, however, take time to make sure all the pieces are in place. This kind of optimization should occur not only on the main web page of a website, but on every single content page within that site.
On-page optimization is the preparation of the actual pages of the website. Search engines are trying to provide their users with the most relevant sites for any particular query. Maybe your website has a lot of material and is relevant to searches for your product, but it may not have been designed or written in a way that is search engine friendly. The correction of design mistakes and the rewriting of the website’s text and meta tags is what is known as on-page search engine optimization.
But even if you have web pages that are well designed and well written, you may still be buried in the rankings. Google, the most popular search engine, relies heavily on the link popularity of websites in its formula (or algorithm) for determining the most relevant answers to any particular query. If your website does not have inbound links from other websites, it will not achieve high rankings in highly competitive categories. Boosting the link popularity of a particular web page or website is known as off-page or off-site search engine optimization.
We rewrite your headlines and important text so that they are search engine friendly and, most importantly, so that they are user friendly! It should always be remembered that your site should be built for the satisfaction of the people who will read it and not for the satisfaction of search engine robots. The goal of your site is to sell your product or service, and the SEO service is aimed at helping you to get more traffic and to convert visitors into buyers.
The following is a checklist you should run through every time you create a new webpage:
File Name:-
The file name of a page is taken into account when search engines spider your website. You should therefore always try to use your main keyword in the file name. If you are using dynamic pages, you could use mod_rewrite in the .htaccess file to rewrite your file names and URLs.
Title Tags:-
The page title is what most search engines will show on their search result pages. It is vital that you make an extra effort to get this right, and it should not be (only) the website's name.
• Is your page title defined and does it contain your main keyword?
• Is it short and does it describe the page content correctly? Preferably 5-8 words.
Example:
<title>On Page Optimization</title>
Meta tags:-
Although most search engines don’t pay much attention to meta tags, it is still important to add your meta tags for on-page optimization. Some search engines still use them to display your page in their lists. Try to make them different for every webpage.
• Is your description defined and does it describe your content? Preferably a maximum of 25 words in your description tag.
• Are your keywords defined and do they also list misspellings and typos?
• Are keywords not repeated more than 3 times? Preferably 1-3 main keywords and 1 keyword phrase in your keyword tag.
Example:
<meta name="description" content="A quick guide and check list for on page Optimization. Covering Titles Tags, Meta Tags, Heading Tags, Plain Text and Image Alt Tags">
<meta name="keywords" content="On page Optimization, Meta Tags, Heading Tags, Title Tags, How to do on page optimizing, on-page, on page, optimization">
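The title and meta tag checks above can be automated with a short script. The following is a minimal sketch: the limits (5-8 title words, at most 25 description words, no keyword more than 3 times) come from the checklist, while the function name `check_head`, the sample pages, and the choice to count repetitions word by word are assumptions made for illustration.

```python
from collections import Counter
from html.parser import HTMLParser
import re


class HeadParser(HTMLParser):
    """Collects the page title and the named meta tags."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "meta":
            attrs = dict(attrs)
            name = (attrs.get("name") or "").lower()
            if name:
                self.meta[name] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_head(html, main_keyword):
    """Return a list of checklist problems found in the page head."""
    parser = HeadParser()
    parser.feed(html)
    parser.close()
    problems = []
    if not 5 <= len(parser.title.split()) <= 8:
        problems.append("title should be 5-8 words")
    if main_keyword.lower() not in parser.title.lower():
        problems.append("title should contain the main keyword")
    description = parser.meta.get("description", "")
    if not description:
        problems.append("description is missing")
    elif len(description.split()) > 25:
        problems.append("description should be at most 25 words")
    # Assumption: repetition is counted per individual word in the keywords tag.
    words = re.findall(r"[a-z0-9]+", parser.meta.get("keywords", "").lower())
    repeated = sorted(w for w, n in Counter(words).items() if n > 3)
    if repeated:
        problems.append("keywords repeated more than 3 times: " + ", ".join(repeated))
    return problems


# Hypothetical pages for illustration.
GOOD_PAGE = """<html><head>
<title>On Page Optimization Checklist for New Webpages</title>
<meta name="description" content="A quick checklist for on page optimization.">
<meta name="keywords" content="on page optimization, meta tags, heading tags">
</head><body></body></html>"""
BAD_PAGE = "<html><head><title>Home</title></head><body></body></html>"

print(check_head(GOOD_PAGE, "on page optimization"))  # []
print(check_head(BAD_PAGE, "on page optimization"))
```

An empty list means the page passed these particular checks; it says nothing about how any search engine will actually rank the page.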
Heading Tags:
Right after your page title, your heading tags are of most importance. Search engines use them to judge the importance of your keywords. If you find your heading tags too big, use CSS to define how they look.
• Make sure you are using only one <h1> tag.
• Is your <h1> short and does it contain your main keyword? Preferably your <h1> tag should match your page title.
• Are your <h2> tags defined and do they contain variations on your keyword?
• Could you be categorizing even more with <h3> tags?
• Have you styled all heading tags using CSS to improve page design?
Example:
<h1>On Page Optimization</h1>
<h2>Page Title</h2>
<p>The page title is what most search engines...</p>
<h2>Meta Tags</h2>
<p>Although most search engines don't pay...</p>

Plain Text:
Spread your keywords throughout the page's paragraphs (<p>) and repeat them a couple of times. Try not to make the text unnatural, because you still want people to be able to read it. Readability should be more important here than on page optimization.
• Are your keywords used at least several times? Try not to have one keyword take up more than 5% of the total number of words used.
• Is your main keyword mentioned at the beginning of the page and at the end?
• Preferably use a maximum of 7 repetitions per word. Distribute 4 in the top third of the page and 3 in the bottom.
• Use every keyword at least once in the first 7500 characters of the page.
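The plain-text checks above are easy to automate. The following is a minimal sketch in Python, not a definitive tool: the function name keyword_report and the sample text are hypothetical, and the thresholds (5% density, 7 repetitions, first 7500 characters) are simply the guideline figures from the checklist, not values published by any search engine.

```python
import re

# Words are runs of letters, digits or apostrophes (an assumption for this sketch).
WORD = re.compile(r"[a-z0-9']+")

def keyword_report(text, keyword, max_density=0.05, max_repeats=7, lead_chars=7500):
    """Summarise how one keyword (a word or a phrase) is used in plain page text."""
    words = WORD.findall(text.lower())
    phrase = WORD.findall(keyword.lower())
    if not phrase:
        raise ValueError("keyword must contain at least one word")
    n = len(phrase)
    # Slide a window over the word list and count exact phrase matches.
    count = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    # Density: share of all words on the page taken up by the keyword.
    density = (count * n) / len(words) if words else 0.0
    return {
        "count": count,
        "density": density,
        "density_ok": density <= max_density,
        "repeats_ok": count <= max_repeats,
        # Simple substring test on the first lead_chars characters of the text.
        "in_lead": keyword.lower() in text[:lead_chars].lower(),
    }

page_text = "On page optimization is simple. Good optimization helps readers find a page."
report = keyword_report(page_text, "optimization")
print(report)
```

In this tiny sample the keyword makes up 2 of 12 words, so the density check fails even though the repetition count is fine. On a real page the text would be much longer and the same two mentions would be well under the 5% guideline.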
Bold and Italic Text:
A bold or italic word carries more weight for most search engines, so try to use your main keyword in bold <b> and italic <i>. Please note that using <strong> and <em> instead of <b> and <i> does not matter. Google has stated that they treat this code exactly the same.

Image Alt Tags and Names:
Search engines don't just look at text; they also take images into account. Try to use image alt tags and name images using your main keyword.
Example:
<img src="images/onpage_optimization.jpg" alt="On page Optimization" />

Promotional Comments:
Some search engines, like the Inktomi search engine, read comments in the "<!--" format. If you place a keyword-rich paragraph inside such a comment at the top of the page, this could help keyword weighting and your keyword relevancy.
Example:
<!-- On page Optimization, by optimizing Title Tags, Meta Tags, Heading Tags, Plain Text, Image Alt Tags and Promotional Content is a great way to optimize WebPages. By using good value keywords targeted traffic can be sent to the page by a search engine. On page Optimization is a technique used for Search Engine Optimization. -->

That's the basics of on page optimization. Once you've done this you can start with off page optimization.
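Several items of the on page checklist can be verified mechanically. The sketch below uses Python's standard html.parser module. The class OnPageAudit, the function audit_page and the sample pages are hypothetical names made up for illustration, and the limits (5-8 title words, 25 description words, exactly one <h1>) are the guideline values from the checklist above.

```python
from html.parser import HTMLParser

class OnPageAudit(HTMLParser):
    """Collects the few page elements that the checklist talks about."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = None
        self.h1_count = 0
        self.images_missing_alt = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.description = attrs.get("content") or ""
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "img" and not (attrs.get("alt") or "").strip():
            self.images_missing_alt.append(attrs.get("src"))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

def audit_page(html, keyword):
    """Return a list of checklist problems found in the HTML (empty if none)."""
    parser = OnPageAudit()
    parser.feed(html)
    parser.close()
    problems = []
    if not 5 <= len(parser.title.split()) <= 8:
        problems.append("title should be 5-8 words")
    if keyword.lower() not in parser.title.lower():
        problems.append("title lacks the main keyword")
    if parser.description is None:
        problems.append("meta description missing")
    elif len(parser.description.split()) > 25:
        problems.append("meta description longer than 25 words")
    if parser.h1_count != 1:
        problems.append("page should have exactly one h1")
    if parser.images_missing_alt:
        problems.append("images without alt text")
    return problems

sample = ('<html><head><title>On Page Optimization</title></head>'
          '<body><h1>A</h1><h1>B</h1><img src="a.jpg"></body></html>')
for problem in audit_page(sample, "on page optimization"):
    print(problem)
```

The checker only reports what the checklist asks for. It says nothing about how any particular search engine actually weighs these elements.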
There are two important factors in any Search Engine Optimization campaign:
• Serve the site visitors what they want.
• Serve the search engine spiders what they want.
Search Engine Submissions:
Once you've optimized your web pages and uploaded them to your server, your next step will be to submit your main pages to the search engines. However, don't submit your pages to Google. Your pages will rank much higher if you allow this search engine to find your pages on its own. You may want to consider creating a site map for your site and submitting this page to Google instead. A site map is a page that outlines how your pages are set up and linked together. If you design a site map with links to all of your pages, the search engine robots can easily spider and index them. Taking the time to optimize each of your web pages is the most important step you can take towards ranking high in the search engines and driving more traffic to your web site.
Search Engine Submission: Getting Listed
"Search engine submission" refers to the act of getting your web site listed with search engines. Another term for this is search engine registration. Getting listed does not mean that you will necessarily rank well for particular terms, however. It simply means that the search engine knows your pages exist.

Think of it as a lottery. Search engine submission is akin to your purchasing a lottery ticket. Having a ticket doesn't mean that you will win, but you must have a ticket to have any chance at all.

Search Engine Optimization: Improving the Odds
"Search engine optimization" refers to the act of altering your site so that it may rank well for particular terms, especially with crawler-based search engines (later in this guide, we will explain what these are). Returning to the lottery example, let's assume there was a way to increase the odds of winning by picking your lottery numbers carefully. Search engine optimization is akin to this. It's making sure that the numbers you select are more likely to win than purchasing a set of numbers at random.

Search Engine Placement & Positioning: Ranking Well
Terms such as "search engine placement," "search engine positioning" and "search engine ranking" refer to a site actually doing well for particular terms or for a range of terms at search engines. This is the ultimate goal for many people: to get that "top ten" ranking for a particular keyword or search term.

Search Engine Marketing & Promotion: The Overall Process
Terms such as "search engine marketing" or "search engine promotion" refer to the overall process of marketing a site on search engines. This includes submission, optimization, managing paid listings and more. These terms also highlight the fact that doing well with search engines is not just about submitting right, optimizing well or getting a good rank for a particular term. It's about the overall job of improving how your site interacts with search engines, so that the audience you seek can find you.

On To Submission
The next few "essentials" pages cover the basics of search engine submission. If all you do is follow the instructions on these essentials pages, you'll receive traffic from search engines. However, if you have time, you should also read beyond the essentials to understand how optimization can increase your traffic and other ways you can market your site with search engines.
2. Off-Page Search Engine Optimization:

The off-page SEO elements include:
• Page Rank
• Back links
• Link Exchange
• Anchor text
• Relevancy
• Directories
• Traffic
• Bookmark

Off page optimization is optimization done off the page, like getting relevant links from other sites, link exchange with quality relevant sites, choosing relevant anchor text from the perfect location on the different pages of different sites, etc.

Link Popularity Building:-
Links are the ultimate driving force behind all search engines today. A quality back link not only helps in search engine ranking but is also capable of developing your brand as unique and bringing quality targeted traffic to your site. The importance of link popularity varies with each search engine, but the basic premise is that every link to your web site is an endorsement of your site's quality, and the more endorsements you have, the higher your site is likely to be listed. Search engine optimization helps build quality text links to your site, thus increasing the visibility of your site.
One Way Linking:-

Most search engines, including Google, place a great amount of value and weight on a site's link popularity. The basic idea behind this is that you should have more incoming links than outgoing links. If your site is viewed by the public as a quality site, then you will automatically receive links to your site with little work on your part. Two ways to increase your chances of one way linking are to provide quality content on your site and to get your site listed in the major human edited directories. Currently the two most important human edited directories are Yahoo and the Open Directory Project (www.dmoz.org).

Reciprocal Link Building:-

Reciprocal linking, or two way linking, is where you provide a link to another web site in return for a link to yours. Such links are less valuable than one way links. This type of linking is done when it is difficult to get one way links. Many search engines use a search algorithm which analyzes the quality of the site linking to you.
Buying Text Link Advertisements:-
This is one of the more expensive options in a link building campaign. Text link advertisements are bought from relevant sites to increase link popularity of the site. This is also the quickest method to give a quick boost to your site’s rankings. While buying the links, the main criterion is Page Rank. Linking with pages having a high Page Rank is the only way to give a quick boost to your site’s rankings and drastically improve targeted traffic.
Nowadays everyone likes to leave comments on blogs. You can do so as well, and add your link with the comment. However, remember not to spam blog sites: if you add a comment, make sure that it has something to do with the topic. It is also wise to post on blogs about subjects you have some knowledge of.

With all this being said, it is important to note that links from high quality sites are better for rankings than links from low quality sites. A few links from reputable sites are worth more than a lot of links from unknown sites. So if you can, try getting links from these sites instead. There are a lot of SEO experts who say that off page optimization is more important than on page optimization. While that is not completely wrong, it is very difficult (but not impossible) for good off page optimization to work without excellent content. And that is ultimately what is on your web pages.
How Search Engines Rank Web Pages:
Search for anything using your favorite crawler-based search engine. Nearly instantly, the search engine will sort through the millions of pages it knows about and present you with ones that match your topic. The matches will even be ranked, so that the most relevant ones come first.

Of course, the search engines don't always get it right. Non-relevant pages make it through, and sometimes it may take a little more digging to find what you are looking for. But, by and large, search engines do an amazing job.

As WebCrawler founder Brian Pinkerton puts it, "Imagine walking up to a librarian and saying, 'travel.' They are going to look at you with a blank face." OK, a librarian's not really going to stare at you with a vacant expression. Instead, they're going to ask you questions to better understand what you are looking for.

Unfortunately, search engines don't have the ability to ask a few questions to focus your search, as a librarian can. They also can't rely on judgment and past experience to rank web pages, in the way humans can.

So, how do crawler-based search engines go about determining relevancy, when confronted with hundreds of millions of web pages to sort through? They follow a set of rules, known as an algorithm. Exactly how a particular search engine's algorithm works is a closely-kept trade secret. However, all major search engines follow the general rules below.
Location, Location, Location...and Frequency:
One of the main rules in a ranking algorithm involves the location and frequency of keywords on a web page. Call it the location/frequency method, for short.

Remember the librarian mentioned above? They need to find books to match your request of "travel," so it makes sense that they first look at books with travel in the title. Search engines operate the same way. Pages with the search terms appearing in the HTML title tag are often assumed to be more relevant than others to the topic.

Search engines will also check to see if the search keywords appear near the top of a web page, such as in the headline or in the first few paragraphs of text. They assume that any page relevant to the topic will mention those words right from the beginning.

Frequency is the other major factor in how search engines determine relevancy. A search engine will analyze how often keywords appear in relation to other words in a web page. Pages with a higher frequency are often deemed more relevant than other web pages.
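To make the location/frequency idea concrete, here is a toy scoring sketch in Python. It is only an illustration of the principle described above: the function location_frequency_score, the weights (3.0 for a title match, 2.0 for a match near the top) and the 50-word "top of page" window are all assumptions invented for this example, since real search engine algorithms are trade secrets.

```python
import re

def tokens(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def location_frequency_score(query, title, body,
                             lead_words=50, title_weight=3.0, lead_weight=2.0):
    """Toy relevance score: where query terms appear, plus how often."""
    query_terms = set(tokens(query))
    title_terms = set(tokens(title))
    body_terms = tokens(body)
    lead_terms = set(body_terms[:lead_words])  # words near the top of the page
    score = 0.0
    for term in query_terms:
        if term in title_terms:          # location: HTML title
            score += title_weight
        if term in lead_terms:           # location: top of the page
            score += lead_weight
        if body_terms:                   # frequency relative to all words
            score += body_terms.count(term) / len(body_terms)
    return score

pages = {
    "travel-page": ("Travel guides", "travel tips for travel lovers"),
    "cooking-page": ("Cooking", "recipes and travel"),
}
ranked = sorted(pages, key=lambda name: location_frequency_score("travel", *pages[name]),
                reverse=True)
print(ranked)
```

With these made-up weights, the page that has "travel" in its title and mentions it more often is ranked first, which mirrors the librarian analogy.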
Spice in the Recipe:
Now it's time to qualify the location/frequency method described above. All the major search engines follow it to some degree, in the same way cooks may follow a standard chili recipe. But cooks like to add their own secret ingredients. In the same way, search engines add spice to the location/frequency method. Nobody does it exactly the same, which is one reason why the same search on different search engines produces different results.
To begin with, some search engines index more web pages than others. Some search engines also index web pages more often than others. The result is that no search engine has the exact same collection of web pages to search through. That naturally produces differences, when comparing their results. Search engines may also penalize pages or exclude them from the index, if they detect search engine "spamming." An example is when a word is repeated hundreds of times on a page, to increase the frequency and propel the page higher in the listings. Search engines watch for common spamming methods in a variety of ways, including following up on complaints from their users.
Off the Page Factors:
Crawler-based search engines have plenty of experience now with webmasters who constantly rewrite their web pages in an attempt to gain better rankings. Some sophisticated webmasters may even go to great lengths to "reverse engineer" the location/frequency systems used by a particular search engine. Because of this, all major search engines now also make use of "off the page" ranking criteria.

Off the page factors are those that webmasters cannot easily influence. Chief among these is link analysis. By analyzing how pages link to each other, a search engine can both determine what a page is about and whether that page is deemed to be "important" and thus deserving of a ranking boost. In addition, sophisticated techniques are used to screen out attempts by webmasters to build "artificial" links designed to boost their rankings.

Another off the page factor is click through measurement. In short, this means that a search engine may watch what results someone selects for a particular search, and then eventually drop high-ranking pages that aren't attracting clicks, while promoting lower-ranking pages that do pull in visitors. As with link analysis, systems are used to compensate for artificial links generated by eager webmasters.

Google's view of Search Engine Optimization (SEO)
With the addition of this document to their website, the people at Google appear to be trying to frighten people away from search engine optimization altogether. Although they say that "Many SEOs provide useful services for website owners", they finish the sentence by describing the range of what those useful services are: "from writing copy to giving advice on site architecture and helping to find relevant directories to which a site can be submitted".

They say that an SEO's useful services include writing copy, giving advice on site architecture and helping to find relevant directories. These can be part of search engine optimization, of course, but they are not what is widely understood by the term search engine optimization, i.e. optimizing pages to rank highly. Even writing copy doesn't suggest anything to do with SEO copywriting, and giving advice on site architecture is to do with website design and not search engine optimization, although an SEO can advise on it with respect to crawling. It is quite clear what sort of things Google considers to be SEO, and it isn't anything to do with optimizing or, if it is, it's only on the fringe of optimizing.

The document goes on to say, "there are a few unethical SEOs who have given the industry a black eye through their overly aggressive marketing efforts and their attempts to unfairly manipulate search engine results". The implication is that search engine optimizers who go further than the sort of things that Google mentions, and actually optimize pages to improve rankings (manipulate search results), are unethical. Google clearly views any sort of optimizing to improve rankings as unethical.
Later in the document, Google lists a number of 'credentials' that reputable search engine optimizers should have. In Google's view, a search engine optimization company should employ a reasonable number of staff (individual SEOs are not reputable), it should offer "a full and unconditional money-back guarantee", it should report "every spam abuse that it finds to Google", and more, and they warn people against those who don't measure up. But there isn't a search engine optimizer in the world, individual or company, who doesn't fall foul of Google's 'credentials'.

There are people who can write copy (not SEO copy), advise on site structure and even find directories to submit to, but they aren't search engine optimizers and, in terms of rankings, they are of limited value. The purpose of search engine optimization is to improve a website's rankings. Google sees that as manipulating the search results, and they don't approve. The impression given by their document is that they are trying hard to scare website owners into not employing search engine optimization services to improve their website's rankings. That, in my opinion, is unethical.
What Ethical Search Engine Optimization Really Does
Suppose there are 1000 hotels in New York, each of which has a website. When somebody types "New York hotels" into a search engine, all 1000 websites are equally relevant to the search. Because of the way that Google and other engines have been designed, they normally display the results 10 at a time. But which of the 1000 hotel sites will be displayed in the first 10, which of them will be displayed in the second 10, and which will be placed right at the bottom of the pile?

It is well-known that searchers don't look very far down the results, so the sites that are nearer the top will take all the business, and those that are further down will get none. But which sites will be at the top? Google uses its algorithms to determine the order of the results. It is patently obvious that all 1000 equally relevant websites will not be displayed on the first results page (the top 10). It is also obvious that equally relevant sites cannot be displayed where they belong. Some necessarily become more equal than others.

So what if the owner of one of the websites decides to try and push his site to the top? Is that wrong? Of course not. The site is just as relevant as the top ones; it's just that Google cannot satisfy all the relevant sites. This is where ethical search engine optimization comes in.

Search engine optimization optimizes a website's pages, so that they will be ranked higher in the search results for the most relevant search terms, according to what the website has to offer. Search engines may well display relevant results at the top, but they can't display all the relevant results at the top. Search engine optimization allows the pages of relevant websites to be displayed at or near the top of relevant search results. And that's all it does.

So what are search engines like Google so afraid of? SEOs have exactly the same aim as the engines: relevant search results. The difference is that search engines don't care about individual websites, whereas search engine optimizers and website owners do. That's the only difference. Engines don't care if a particular website is in the top 10; SEOs care very much that a particular website is in the top 10. But they can't get an off-topic site there because the search engine algorithms see to that. And that's an important point: search engine optimization can only get pages to the top of relevant results. The search engines' own algorithms keep off-topic pages out.
Search engines want relevant sites at the top. As a search engine optimizer, I want relevant sites at the top, and I want one of them to be my site. Search engines can't place all the relevant sites at the top, and they aren't going to do my site any individual favors, so I give them a hand and adjust things so that my relevant pages are at the top of the relevant results. That's all that ethical search engine optimizers do. They are not against the engines, and they don't try to achieve what the engines themselves don't want (irrelevant results). They merely strive to adjust the results by placing a different, but equally relevant site at the top. And there's nothing unethical about that.
Sites and Relevance:
When we talk about relevance, we need to be clear what we are talking about. Like SEO techniques, relevance is not a black and white issue. There are lots of shades of grey.

Any individual person may say they find a page relevant to a particular keyword. That doesn't mean that it is among the most relevant pages on the Web for that individual or for other individuals. This is a subjective assessment.

Search engines work at the group level. To a search engine, every page in the index is relevant for every possible keyword. The question is "How relevant?" A search engine applies algorithms to determine a relevance score and orders its search results by that relevance score, most relevant first. Thus the results at the top of any set of search results are literally the most relevant. This is still a subjective assessment, as it is effectively made by the programmers of the search engine algorithms. However, as the assessment is made automatically by the algorithm, according to pre-determined criteria, there is also an element of objectivity to it.

The function of a search engine is to deliver search results in response to keywords that the individual searchers in its target market find to be relevant; in other words, for its assessment of relevance to match its users' assessment. Search engine algorithms mainly assess relevance based on the content that people will see on a page and the links that searchers will follow, both to and from a page.
When, in response to a particular keyword, a search engine scores a page low for relevance and therefore ranks it low, there are two methods for increasing its score and its ranking:

1. To apparently make the page more relevant, by deceiving the search engine algorithm into assuming that content will be seen on a page when in fact it won't, or that links will be followed to or from a page when in fact they won't. This corresponds to Black Hat techniques.

2. To actually make the page more relevant by changing the content and link structure, but still using content that people will see and links that people will follow. This corresponds to White Hat techniques.
Optimize a Website for the Search Engines

1. Pay attention to the relevance of the contents!

There is one thing you should never forget when you write and publish your pages: never over-optimize your contents. If there is no sense in what you are writing, it is less useful to your visitors. Visitors may be put off by that, and moreover the search engines will detect excessive use of keywords (keyword stuffing) and the like. So always keep in mind that your texts should be relevant and useful to your actual audience.
2. Pick your main target keywords!

It is essential to pick at least one main target keyword. This is the keyword you would like your visitors to type into the search engine and then see your site at the top of the result page. If your business has just started out, your main target keyword should not be a very competitive one. A keyword is not necessarily one word; it can also be a key phrase. After you have picked your target keyword(s), you should concentrate on optimizing your web site for them. This does not mean stuffing your site with the keyword, but rather mentioning it from time to time in a relevant sentence.
3. Optimization for different sections!

Now that you have in mind what you are optimizing for, you can go on with web site search engine optimization on, let's say, your homepage first. Search engine optimization is not only done on the texts you are writing. The most important parts of your page are the title, the meta description and the meta keywords, the images' ALT attributes, the first body text (besides the other contents), the headlines (esp. H1, H2), and the anchor texts of links, amongst others.

Page Title
You should avoid stuffing your page's title with keywords only. This was a widespread technique in the past, but nowadays it is considered spamming and brings you one step closer to being penalized. The page's title should include your business and the purpose or main point of the current page (or the website itself). The first 5 words are the most important ones, so try to include one of your main target keywords at the beginning of the tag. However, only do that if the keyword is relevant to the displayed page! Do not put in senseless keywords that cannot be found anywhere else on the page. This is likely to be considered spamming, as already mentioned.

Meta Description
This should also not be stuffed with keywords only. It is best practice to write one or two very relevant sentences that summarize the contents of the page. The sentences should be grammatically correct so that they are not considered spam.

Meta Keywords
Always make sure that you do not put in too many keywords. There are web sites around that use more than 30 keywords, which is way too many. Why? Simply because too many keywords decrease the keyword density of the actual target keywords that your visitors should search for in the search engines. Besides, you should order the keywords by relevancy. Start with the most relevant main keyword and finish with the less relevant ones, so that your selected main target keywords appear at the very beginning of this tag. Also keep in mind that you should not list any keywords that never appear on your web site, as this may be considered spam.
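The two meta keyword rules above (not too many keywords, and no keywords that never appear on the page) can be checked with a few lines of Python. This is a rough sketch only: the function name audit_meta_keywords is hypothetical, the limit of 30 is just the figure mentioned in the text as clearly excessive, and the simple substring match is an assumption that ignores word forms and markup.

```python
def audit_meta_keywords(meta_keywords, body_text, max_keywords=30):
    """Check a comma-separated meta keywords string against the page text."""
    keywords = [k.strip() for k in meta_keywords.split(",") if k.strip()]
    body = body_text.lower()
    # Keywords listed in the tag but never used in the visible text.
    missing = [k for k in keywords if k.lower() not in body]
    return {
        "count": len(keywords),
        "too_many": len(keywords) > max_keywords,
        "missing_from_page": missing,
    }

result = audit_meta_keywords(
    "on page optimization, meta tags, cheap flights",
    "A guide to on page optimization and meta tags.",
)
print(result)
```

Here "cheap flights" would be flagged because it never appears in the page text, which is the situation the text warns may be treated as spam.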
The ALT attribute of images
Again, spamming is to be avoided in ALT attributes. The ALT attribute was intended to hold alternative text that is displayed when the image cannot be shown. It should include only one or two keywords that actually refer to the image. If the image is an ad banner and has nothing to do with the contents, do NOT put your main target keyword in the ALT attribute, as this may decrease its relevancy!

Headlines (H1, H2 ...)
It is likely that more weight is put on text that appears in headlines. If your main keywords appear in the headlines, their relevancy may be increased. However, never wrap all your body text in one headline tag; this will be of no use and is a way of trying to trick the search engines. Always use headlines in a common, useful way, as if you were writing an article for a newspaper.
Anchor Text of Internal and External Links

The anchor text apparently plays an important role in keyword relevancy too. Having internal links to pages that write about, or are related to, your main keywords is of great value. It is also good practice to embed links in a sentence. Links may gain relevance by being logically inserted into a sentence. However, you should always have something like a menu for easier use by the human visitor!

Your Keyword Density

The keyword density is the number of occurrences of a keyword compared to the total number of words in the body text. Keyword density does not play as big a role in web site search engine optimization as it did a year or two ago. However, increasing the keyword density in your body text and the meta keywords can increase the keyword's relevancy a bit. Despite that, never stuff your texts with the same keyword over and over. Too high a density may backfire and reduce its relevancy. There is no perfect keyword density (e.g. 1%), though you should always mention your main keyword at least once in the body text.
Initial Promotion of Web site

If you have not yet promoted your business anywhere, and you are not yet indexed by any of the search engines, this section may be of interest to you. If you are optimizing an established web site and you do not need to do the basic promotion again, look at more advanced SEO techniques to outperform your competitors.

It is best practice to start by building one-way links to your site. However, it is unlikely that other web sites will link to yours unless you are already trusted or have high quality content. The best way to get your first one-way links is by submitting to SEO friendly web directories that do not require a reciprocal link back to their site. Before everything else, however, you should consider submitting your site to the Open Directory Project. This is very important for new businesses, because it is a very trusted web directory that is constantly updated. It takes weeks or even months to be included in this directory, though.

To achieve even more publicity you should try to gain visitors, for example by adding a blog to your website. Blogs are a great way to attract visitors and the search engines, if you update your blog regularly. If you write high quality articles about the area of your business or web site, it is quite likely that some sites will link to your article's permanent link. These are quality one-way links that increase your link popularity as well.

The next step is to show others that you are an expert in your area. Forums or bulletin boards are the key to spreading the word about your online business or web site. Be active on forums that are relevant to your business, and you'll draw the attention not only of the forum visitors and members, but also of the search engines, through your signature or relevant threads with your link inside.

Having done the basic promotion, you can think of advanced SEO techniques in order to increase your link popularity and quality.
Great Ways to Build Back links for a Website

One of the most important factors that influence the search engine ranking of your site is the number of back links you have. Almost every search engine takes the number of back links into account while evaluating and ranking your site in the SERPs (search engine result pages). For Google in particular, the number of back links is often described as the most important factor for ranking well.

Back Links:
Back links are basically incoming links to a particular website or webpage. The more back links a website or webpage possesses, the more popular or important it is considered to be. Back links are also known as incoming links, inbound links, in links and inward links.

Ever wondered why back links are so important in search engine rankings? Almost all the search engines, especially Google, value websites that are rich in useful and informative content. Whenever a site places a link to your site, Google considers that link a vote for your site. It assumes that the other site cast a vote for your site because the content of your site is useful to its visitors. Hence the greater the number of votes, the greater the value your site has from the search engine's perspective. And if you are after page rank, then back links are the way to go. Back links help your website attain a higher search engine ranking and a higher page rank.
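The "links as votes" idea can be illustrated with a simple counter. The Python sketch below is a deliberate simplification, not how Google computes page rank: the function count_backlinks and the example URLs are hypothetical, and counting at most one vote per linking site while ignoring a site's links to itself is an assumption made for this example.

```python
from collections import Counter
from urllib.parse import urlparse

def count_backlinks(links):
    """Count back links per site from (source_url, target_url) pairs."""
    votes = Counter()
    seen = set()
    for source, target in links:
        source_site = urlparse(source).netloc
        target_site = urlparse(target).netloc
        # Internal links are not votes; repeat links from one site count once.
        if source_site == target_site or (source_site, target_site) in seen:
            continue
        seen.add((source_site, target_site))
        votes[target_site] += 1
    return votes

links = [
    ("http://a.example/post", "http://mysite.example/"),
    ("http://a.example/other", "http://mysite.example/page"),
    ("http://b.example/", "http://mysite.example/"),
    ("http://mysite.example/a", "http://mysite.example/b"),
    ("http://mysite.example/", "http://a.example/"),
]
print(count_backlinks(links).most_common())
```

In this sample, mysite.example receives two votes (one from each external site) and a.example receives one. Real search engines also weigh the quality of the linking site, as discussed elsewhere in this report.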
Ways to Create Back links:
One of the best ways of creating one-way links for your site is through the posting of comments in blogs. Just include the URL of your site while posting comments in blogs and soon you will build a good many number of one-way links for your site.
Posting in forums is also an excellent way for building back links. But don't post your URL directly in the discussion boards or else the spam busters would get you banned and removed from the forum. The best way to do it is through your signature. Include the URL of your site in your signature and then whenever you post a comment in the forum you will be leaving behind a link to your site along with your name.
Since the evolution of WEB2.0 social networking sites like MySpace, Friendster, Tag world etc are the hottest zones on the Internet today. Millions of people visit these sites daily for making friends or for promoting their products or services. You may also register at such sites and post links to your site along with the comments that you make at your friend's profile. In this way you may create large number of back links for your site.
Social bookmarking is another area where you could create one-way links to your site. Just bookmark your sites in social bookmarking sites like Delicious and if your site is interesting enough, others will add your site to their list of bookmarked sites and soon you will garner a large number of one-way links for your site.
You may write articles on various topics, post them on article submission sites, and include the URL of your site in the resource box or in the author's bio section. Many sites regularly republish articles from such article sites, and along with the article they also publish the author's bio, which contains the link to your site. If your article is informative and intriguing, it will soon be published on many other sites and in newsletters. This way you will have a large number of back links pointing to your site.
As soon as you come up with a new feature on your website, do a press release on it. Press releases are very efficient for website promotion and also for creating plenty of one-way links. Write a press release about any unique feature of your site and submit it to various press release sites. Press release sites regularly syndicate their content through RSS and Atom feeds, and many leading news sites fetch their content from these press release sites. So you may expect a very good response from a single press release.
Directories:-
Submit your website to as many web directories as you can. These web directories attract a large number of visitors daily, so you may get a good number of visitors from them. Submission to directories also helps in building quality back links for your website. Consider submitting your website to leading web directories such as Dmoz.org.
Though it is a very old method for building traffic and back links, link exchange still works. Exchange links with relevant, good-quality sites to build back links.
The methods mentioned above will help you build a large number of back links to your site. But always remember that the content of your site is what really matters. Concentrate on making exceptionally good content for your site and others will voluntarily link to it.
What is a Sitemap?
A sitemap displays the inner framework and organization of your site's content to the search engines. Your sitemap should reflect the way visitors would intuitively work through your site. Years ago sitemaps existed only as a boring series of links in list form. Today, they are thought of as an extension of your site. You should use your sitemap as a tool to provide your visitors and the search engines with more content. Create details for each section and sub-section through descriptive text placed under the sitemap link. This will help your visitors understand and navigate through your site, and will also give you more food for the search engines. You can even add Flash to your sitemap, as was done with the interactive Bruce Clay sitemap. Of course, if you do include a Flash sitemap for your visitors, you will also need to include a text map so that the robots can read it.
A good site map will:
* Show a quick, easy to follow overview of your site.
* Provide a pathway for the search engine robots to follow.
* Provide text links to every page of your site.
* Quickly show visitors how to get where they need to go.
* Give visitors a short description of what they can expect to find on each page.
* Utilize important keyword phrases.
Why Are They Important?
Sitemaps are very important for two main reasons. First, your sitemap provides food for the search engine spiders that crawl your site. The sitemap gives the spider links to all the major pages of your site, allowing every page included on your sitemap to be indexed. Having all of your major pages included in the search engine database makes your site more likely to come up in the search engine results when a user performs a query. Your sitemap pushes the search engines toward the individual pages of your site instead of making them hunt around for links. A well-planned sitemap can help ensure your website is fully indexed by search engines.
Second, sitemaps are very valuable for your human visitors. They help visitors understand your site structure and layout, while giving them quick access to your entire site. A sitemap is also a lifeline for lost users. A visitor who gets lost or stuck inside your site will begin to look for a way out. A detailed sitemap will show them how to get back on track and find what they were looking for. Without it, the visitor may simply close the browser or head back to the search engines, and the conversion is lost.
Tips for Creating a Sitemap
Your sitemap should be linked from your homepage. This lets search engines find it right away and then follow it all the way through the site. If it is linked only from other pages, the spider is likely to hit a dead end along the way and just quit.
Small sites can place every page on their sitemap, but larger sites should not. You do not want the search engines to see a never-ending list of links and assume you are a link farm. Most SEO experts believe you should have no more than 25 to 40 links on your sitemap. This will also make it easier to read for your human visitors. Remember, your sitemap is there to assist your visitors, not confuse them.
The title of each link should contain a keyword whenever possible and should link to the original page. We recommend writing a short description (10-25 words) under each link to help visitors learn what the page is about. Having short descriptions will also contribute to your depth of content with the search engines.
Once the sitemap is created, go back and make sure that all of your links are correct. If you have 15 pages on your sitemap, then all 15 pages need to link to every other sitemap page. Otherwise both visitors and search engine spiders will find broken links and lose interest.
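The tips above can be sketched as a small script that turns a list of pages into a plain text-link sitemap with a short description under each link. This is only an illustration under stated assumptions: the function name build_sitemap, the page addresses and the descriptions are all hypothetical.

```python
from html import escape


def build_sitemap(pages):
    """Build an HTML list of text links from (url, title, description) tuples."""
    items = []
    for url, title, description in pages:
        items.append(
            f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a><br>'
            f"{escape(description)}</li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


# Hypothetical pages; each link title contains a keyword and has a short description
pages = [
    ("/products.htm", "Carpet Cleaning Products",
     "Browse our full range of carpet cleaning machines and supplies."),
    ("/contact.htm", "Contact Our Carpet Experts",
     "Phone, email and postal details for sales and support."),
]
print(build_sitemap(pages))
```

Because the output is ordinary HTML text links, both visitors and search engine spiders can follow it.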
Remember to Update!
Just as you can't leave your website to fend for itself, the same applies to your sitemap. When your site changes, make sure your sitemap is updated to reflect that. What good are directions to a place that's been torn down? Keeping your sitemap current will make you an instant visitor and search engine favorite.
Two kinds of methods are mainly used to increase a website's score and ranking: white-hat tactics and black-hat tactics.
Any search engine positioning tactic that maintains the integrity of your website and the SERPs (search engine results pages) is considered a "white-hat" search engine positioning tactic. These are the only tactics that we will use whenever applicable, and they enhance rather than detract from your website and from the rankings.
White-Hat Search Engine Positioning Tactics:
1. Internal Linking
By far one of the easiest ways to stop your website from ranking well on the search engines is to make it difficult for search engines to find their way through it. Many sites use some form of script to enable fancy dropdown navigation and similar effects. Many of these scripts cannot be crawled by the search engines, resulting in unindexed pages. While many of these effects add visual appeal to a website, if you are using scripts or some other form of navigation that will hinder the spidering of your website, it is important to add text links to the bottom of at least your homepage, linking to all your main internal pages and to a sitemap of your internal pages.
2. Reciprocal Linking
Exchanging links with other webmasters is a good way (not the best, but good) of attaining additional incoming links to your site. While the value of reciprocal links has declined a bit over the past year they certainly still do have their place. A VERY important note is that if you do plan on building reciprocal links it is important to make sure that you do so intelligently. Random reciprocal link building in which you exchange links with any and virtually all sites that you can will not help you over the long run. Link only to sites that are related to yours and whose content your visitors will be interested in and preferably which contain the keywords that you want to target. Building relevancy through association is never a bad thing unless you're linking to bad neighborhoods (penalized industries and/or websites).
3. Content Creation
Don't confuse "content creation" with doorway pages and such. When we recommend content creation we mean creating quality, unique content that will be of interest to your visitors and will add value to your site. The more content-rich your site is, the more valuable it will appear to the search engines, to your human visitors, and to other webmasters, who will be far more likely to link to your website if they find you to be a solid resource on their subject. Creating good content can be very time-consuming; however, it will be well worth the effort in the long run. As an additional bonus, these new pages can be used to target additional keywords related to the topic of the page.
4. Writing for Others
You know more about your business than those around you, so why not let everyone know? Whether it is in the form of articles, forum posts, or a spotlight piece on someone else's website, creating content that other people will want to read and post on their sites is one of the best ways to build links to your website that don't require a reciprocal link back.
5. Site Optimization
The manipulation of your content, wording, and site structure for the purpose of attaining high search engine positioning is the backbone of SEO and the search engine positioning industry. Everything from creating solid title and Meta tags to tweaking the content to maximize its search engine effectiveness is key to any successful optimization effort. That said, it is of primary importance that the optimization of a website not detract from the message and quality of the content contained within the site. There's no point in driving traffic to a site that is so poorly worded that it cannot convey the desired message and thus cannot sell. Site optimization must always maintain the salability and solid message of the site while maximizing its exposure on the search engines.
Webmasters constantly attempt to "trick" the search engines into ranking sites and pages by illegitimate means. Whether this is done through doorway pages, hidden text, interlinking, keyword spamming or other means, these tactics are meant only to trick a search engine into placing a website high in the rankings. Because of this, sites using black-hat tactics tend to drop from these positions as fast as they climb.
Black-Hat Search Engine Positioning Tactics:
1. Keyword Stuffing
This is probably one of the most commonly abused forms of search engine spam. Essentially this is when a webmaster or SEO places a large number of instances of the targeted keyword phrase on a page in hopes that the search engine will read this as relevant. In order to offset the fact that this text generally reads horribly, it will often be placed at the bottom of a page and in a very small font size. An additional tactic that is often associated with this practice is hidden text, which is discussed below.
2. Hidden Text
Hidden text is text that is set to the same color as the background or very close to it. While the major search engines can easily detect text set to the same color as a background, some webmasters will try to get around this by creating an image file the same color as the text and setting the image file as the background. While undetectable at this time to the search engines, this is blatant spam, and websites using this tactic are usually quickly reported by competitors and blacklisted.
3. Cloaking
In short, cloaking is a method of presenting different information to the search engines than a human visitor would see. There are too many methods of cloaking to list here, and some of them are still undetectable by the search engines. That said, which methods still work and for how long is rarely set in stone. As with hidden text, when one of your competitors figures out what is being done (and don't think they aren't watching you if you're holding one of the top search engine positions), they can and will report your site and it will get banned.
4. Doorway Pages
Doorway pages are pages added to a website solely to target a specific keyword phrase or phrases, and they provide little in the way of value to a visitor. Generally the content on these pages provides no information, and the page is only there to promote a phrase in hopes that once a visitor lands there, they will go to the homepage and continue on from there. Often, to save time, these pages are generated by software and added to a site automatically. This is a very dangerous practice. Not only are many of the methods of injecting doorway pages banned by the search engines, but after a quick report of this practice to the search engine your website will simply disappear, along with all the legitimate ranks you have attained with your genuine content pages.
5. Redirects
Redirecting, when used as a black-hat tactic, is most commonly brought in as a complement to doorway pages. Because doorway pages generally have little or no substantial content, redirects are sometimes applied to automatically move a visitor to a page with actual content, such as the homepage of the site. As quickly as the search engines find ways of detecting such redirects, the spammers uncover ways around detection. That said, the search engines figure them out eventually and your site will be penalized. That, or you'll be reported by a competitor or a disgruntled searcher.
6. Duplicate Sites
A throwback tactic that rarely works these days. When affiliate programs became popular, many webmasters would simply create a copy of the site they were promoting, tweak it a bit, and put it online in hopes that it would outrank the site it was promoting and capture its sales. As the search engines would ideally like to see unique content across all of their results, this tactic was quickly banned, and the search engines have methods for detecting and removing duplicate sites from their index. If the site is changed just enough to avoid automatic detection, with hidden text or the like, you can once again be reported to the search engines and be banned that way.
Search Engine Optimization Has Become One of the Hottest Topics on the Internet
The Internet! Never in the history of mankind has there been a venue of this magnitude for reaching so many potential customers. Currently, there are over 391 million people online throughout the world, and this number is growing daily at an unprecedented rate. Internet studies now show that over 85% of the millions of people online use search engines on a daily basis when they are surfing the Internet. With this in mind, it is no mystery why search engine optimization has become one of the hottest topics within Internet marketing. With millions of people online, the amount of daily search engine traffic is staggering, and profitable to those who take advantage of it. For an individual or company thinking of marketing products or services, the Internet can be a virtually limitless gold mine of potential customers at a relatively low acquisition cost. However, to reach these customers you will need to position yourself on those same search engines that hundreds of millions of people are using daily to look for your services or products.
When queried, a search engine will return all of the web sites it has in its database about a particular subject or product. If your web site falls into the requested search category, its position in the returned list is called your web site ranking. Your web site could be number 1 or number 2001 on a particular search engine for a particular search term or keyword. However, usually only the first few (top 10) ranked websites get to reap the sweet fruit of profit and victory over their competition. Some search engines will return hundreds of thousands of web pages on a search topic. When a web site is developed with search engine rankings in mind, with the goal of appearing in the first few results for certain search keywords, this is called search engine optimization (SEO).
COMMON SEO MISTAKES:-
Many web designers fall into the trap of designing a fancy website that looks great but includes elements that will cripple its search engine rankings. It is a huge waste of time, effort, and sometimes money to create a beautiful looking site that does not attract any visitors. What good is all that beauty if no one can find it? Here are some common design elements that should be avoided whenever possible.
• Not having a <title> for the webpage
When a user searches for information on a search engine, they usually have a choice of which link to click. Users scan through the titles and form a first impression of your site from them. A boring title will most likely be ignored. The title tag is therefore an important part of your site, and search engines do display its contents in their results. Not having a title tag will harm the ranking of your website. Irrelevant titles such as 'welcome to my site' or 'my world' are equally damaging to your chances of a good rank. It is vital to have a title that truly reflects your site. However, try not to make the title too long, as most search engines cut off titles beyond a certain length.
• Irrelevant keywords
Some sites use keywords that are unrelated to their content, for example a site selling carpeting equipment that uses keywords like "David Beckham". Plenty of people searching for David Beckham will end up at the site, only to realize that it has no connection with David Beckham. Nothing is gained from such traffic. On the other hand, having the right keywords can be essential for bringing in more potential customers. Make sure that each keyword used is specific and gives an accurate picture of the whole site.
• Having too many graphics
Beginners often like to include a lot of pictorial content, sometimes even Flash animation. A vast number of graphics or Flash animations does a site no good. It might look pleasing to the human eye, but it means nothing to a search engine. Search engines are unable to index content or navigation that is embedded in a Flash file. A huge quantity of pictures will not help your site's ranking either, and the site might take a very long time to load. It is much wiser to have more keyword-rich text content than a page flooded with graphics.
• Not enough links
If your site has very few incoming and outgoing links, chances are your traffic will be extremely limited. The best way to fix this is to exchange links with other sites. Links from other websites to your own play an important part in increasing your Page Rank with the search engines. However, it is also important that those links come from good-quality sites rather than low-quality or banned sites. Otherwise they will not contribute much to your site.
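The title mistakes described above can be checked automatically. The following is a minimal sketch using Python's standard html.parser module. The function name check_title, the list of generic titles and the 60-character limit are assumptions chosen for illustration; the text does not state an exact length limit.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text found inside the <title> element, if there is one."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Irrelevant titles quoted in the text above
GENERIC_TITLES = {"welcome to my site", "my world"}


def check_title(html, max_length=60):
    """Return 'missing', 'generic', 'too long' or 'ok' (max_length is an assumed limit)."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = (parser.title or "").strip()
    if not title:
        return "missing"
    if title.lower() in GENERIC_TITLES:
        return "generic"
    if len(title) > max_length:
        return "too long"
    return "ok"


print(check_title("<html><head><title>My World</title></head></html>"))
```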
Image Links:-
Using images instead of text links for your site's navigation menu can confuse search engine bots and may make your site difficult for them to spider completely. If you must use graphic elements for links, be sure to also include a set of text links for the spiders to follow. Many sites do this in their footer, either above or below the copyright information.
Frames:-
If you use frames to display your site, you run the risk that the search engine's spider will not pick up all of your content. Most bots will only spider the first HTML file they encounter, ignoring your other frames. If you use frames and notice that only a fraction of your pages make it into the search engine's index, the frames could be the problem. Also, many users are turned off by the sight of multiple frames and will not stick around your site long enough to purchase your product or click an advertisement.
Splash Intro Pages:-
Splash introduction or entry pages, often using flash animation, are another SEO pitfall for many new webmasters. While the graphics and animation may look impressive to human visitors, they lack any useful content for the search engine spiders. Making this even worse is the fact that the splash page comes before your index page, which is the most important page to a search engine bot. Many times, splash pages also include an automatic redirect if the introduction is not skipped. Search engines usually do not index redirects and there is always the possibility that a new algorithm will penalize sites for this.
In order to get high rankings in the search engines, first they have to find your site, and then they have to be able to read enough of your content to rank the site. Eliminating style elements that confuse or restrict the spiders will help make your website more accessible and easier to index. If you must use things like image links, make sure to also have an alternate set of text links somewhere on the page for the spiders to follow. More dramatic elements like frames and splash pages should be avoided altogether if you want to get the best possible search engine ranking for your site.
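To see why text links matter, it helps to look at a page the way a very simple spider might. The sketch below collects only ordinary link tags, so navigation that exists only inside a script is never found. It is an illustration, not how any real search engine works; the function name extract_links and the sample page are assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the targets of ordinary <a href="..."> links."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Turn relative addresses into absolute ones
                self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page: two real links and one script-only "link"
PAGE = """
<a href="/about.htm">About us</a>
<a href="products.htm"><img src="products.gif" alt="Products"></a>
<span onclick="window.location='contact.htm'">Contact</span>
"""
for link in extract_links(PAGE, "http://www.yourbusinessname.com/index.htm"):
    print(link)
```

The contact page is reachable for a human visitor, but this simple spider never sees it, which is why an alternate set of text links is recommended.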
Duplicate Content:-
It is very bad practice to use another website's content. If your site contains exactly the same text as another web site, and Google works out that you published your web page later than the other site, Google and other search engines may consider your site to be spamming. Duplicate content will not necessarily result in an immediate penalty, but your ranking is likely to be negatively affected. Furthermore, it is not recommended to repeat content, i.e. larger body texts, on several pages of your own web site. That means that if you have 3 links pointing to pages that are named differently but contain the same information, you are likely to be regarded as trying to trick the search engines as well. This may result in a loss of relevance...
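Before publishing, you can roughly compare two body texts yourself. The sketch below uses Python's standard difflib module to measure how much two texts overlap word by word. It is only an illustration: how search engines actually detect duplicates is not public, and the function name similarity and the sample texts are assumptions.

```python
from difflib import SequenceMatcher


def similarity(text_a, text_b):
    """Return a value from 0.0 (nothing in common) to 1.0 (identical word sequences)."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


page_one = "We sell carpet cleaning equipment for homes and offices"
page_two = "We sell carpet cleaning equipment for homes and hotels"
print(round(similarity(page_one, page_two), 2))
```

A value close to 1.0 suggests that the two pages carry essentially the same body text and that one of them should be rewritten.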
Over Optimization:-
Over optimization occurs when you put too much weight on the search engine optimization of your web site, for example when your pages have no really relevant content for your visitors but only keyword-stuffed body text. Too high a keyword density may increase the likelihood of getting an Over Optimization Penalty.
This also applies to your page title, your Meta description and Meta keywords, as well as the ALT attributes of your images. None of the named objects should be stuffed with keywords nor be irrelevant. The title, for example, should always reflect the contents of the page. Although you won't be penalized for just having lots of keywords in the title, you will see a decrease in page relevancy that possibly results in a loss of your search engine ranking. Therefore, avoid optimizing your pages too intensely by all means.
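Keyword density is the number of times a keyword phrase appears on a page, expressed as a percentage of the page's total word count. A minimal sketch of the calculation follows; the function name keyword_density and the sample page are assumptions made for illustration.

```python
import re


def keyword_density(text, phrase):
    """Return occurrences of `phrase` as a percentage of the total word count."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    target = phrase.lower().split()
    n = len(target)
    # Count every position where the phrase starts
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == target)
    return 100.0 * hits / len(words)


# Hypothetical 200-word page in which the keyword appears 10 times
page = " ".join(["carpet"] * 10 + ["filler"] * 190)
print(keyword_density(page, "carpet"))  # 10 / 200 = 5%
```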
COMMON TERMS USED IN THE WORLD OF SEO:-
Search Engine Optimization (SEO) has become an essential weapon in the arsenal of every online business. Unfortunately, for most business owners and marketing managers (and even many webmasters), it's also somewhat of an enigma. This is partly due to the fact that it's such a new and rapidly changing field, and partly due to the fact that SEO practitioners tend to speak in a language all of their own which, without translation, is virtually impenetrable to the layperson. This glossary seeks to remedy that situation, explaining specialist SEO terms in plain English...
ALGORITHM
A complex mathematical formula used by search engines to assess the relevance and importance of websites and rank them accordingly in their search results. These algorithms are kept tightly under wraps as they are the key to the objectivity of search engines (i.e. the algorithm ensures relevant results, and relevant results bring more users, which in turn brings more advertising revenue).
ARTICLE PR
The submitting of free reprint articles to many article submission sites and article distribution lists in order to increase your website's search engine ranking and Google Page Rank. (In this sense, the "PR" stands for Page Rank.) Like traditional public relations, article PR also conveys a sense of authority because your articles are widely published. And because you're proving your expertise and freely dispensing knowledge, your readers will trust you and will be more likely to remain loyal to you. (In this sense, the "PR" stands for Public Relations.)
ARTICLE SUBMISSION SITES
Websites which act as repositories of free reprint articles. They are sites where authors can submit their articles free of charge, and where webmasters can find articles to use on their websites free of charge. Article submission sites generate revenue by selling advertising space on their websites.
BACKLINK
A text link to your website from another website.
COPY
The words used on your website.
COPYWRITER
A professional writer who specializes in the writing of advertising copy (compelling, engaging words promoting a particular product or service).
CRAWL
Google finds pages on the World Wide Web and records their details in its index by sending out 'spiders' or 'robots'. These spiders make their way from page to page and site to site by following text links. To a spider, a text link is like a door.
DOMAIN NAME
The virtual address of your website (normally in the form www.yourbusinessname.com). This is what people will type when they want to visit your site. It is also what you will use as the address in any text links back to your site.
EZINE
An electronic magazine. Most publishers of ezines are desperate for content and gladly publish well written, helpful articles and give you full credit as author, including a link to your website.
FLASH
A technology used to create animated web pages (and page elements).
FREE REPRINT ARTICLE
An article written by you and made freely available to other webmasters to publish on their websites.
GOOGLE
The search engine with the greatest coverage of the World Wide Web, and which is responsible for most search engine-referred traffic. Of approximately 11.5 billion pages on the World Wide Web, it is estimated that Google has indexed around 8.8 billion. This is one reason why it takes so long to increase your ranking!
GOOGLE PAGE RANK
How Google scores a website’s importance. It gives all sites a mark out of 10. By downloading the Google Toolbar, you can view the PR of any site you visit.
GOOGLE TOOLBAR
A free tool you can download. It becomes part of your browser toolbar. Its most useful features are its Page Rank display (which allows you to view the PR of any site you visit) and its AutoFill function (when you're filling out an online form, you can click AutoFill, and it enters all the standard information automatically, including Name, Address, Zip code/Postcode, Phone Number, Email Address, Business Name, Credit Card Number (password protected), etc.). Once you've downloaded and installed the toolbar, you may need to set up how you'd like it to look and work by clicking Options (setup is very easy). NOTE: Google does record some information (mostly regarding sites visited).
HTML
HTML (Hypertext Markup Language) is the coding language used to create much of the information on the World Wide Web. Web browsers read the HTML code and display the page that code describes.
INTERNET
An interconnected network of computers around the world.
JAVASCRIPT
A programming language used to create dynamic website pages (e.g. interactivity).
KEYWORD
A word which your customers search for and which you use frequently on your site in order to be relevant to those searches. This is known as targeting a keyword. Most websites actually target 'keyword phrases' because single keywords are too generic and it is very difficult to rank highly for them.
KEYWORD DENSITY
A measure of the frequency of your keyword in relation to the total word count of the page. So if your page has 200 words, and your keyword phrase appears 10 times, its density is 5%.
KEYWORD PHRASE
A phrase which your customers search for and which you use frequently on your site in order to be relevant to those searches.
LINK
A word or image on a web page which the reader can click to visit another page. There are normally visual cues to indicate to the reader that the word or image is a link.
LINK PATH
Using text links to connect a series of pages (i.e. page 1 connects to page 2, page 2 connects to page 3, page 3 connects to page 4, and so on). Search engine 'spiders' and 'robots' use text links to jump from page to page as they gather information about your site, so it's a good idea to allow them to traverse your entire site via text links.
LINK PARTNER
A webmaster who is willing to put a link to your website on their website. Quite often link partners engage in reciprocal linking.
LINK POPULARITY
The number of links to your website. Link popularity is the single most important factor in a high search engine ranking. Webmasters use a number of methods to increase their site's link popularity including article PR, link exchange (link partners / reciprocal linking), link buying, and link directories.
LINK TEXT
The part of a text link that is visible to the reader. When generating links to your own site, they are most effective (in terms of ranking) if they include your keyword.
META TAG
A short note within the header of the HTML of your web page which describes some aspect of that page. These meta tags are read by the search engines and used to help assess the relevance of a site to a particular search.
NATURAL SEARCH RESULTS
The ‘real’ search results. The results that most users are looking for and which take up most of the window. For most searches, the search engine displays a long list of links to sites with content which is related to the word you searched for. These results are ranked according to how relevant and important they are.
RANK
Your position in the search results that display when someone searches for a particular word at a search engine.
RECIPROCAL LINK
A mutual agreement between two webmasters to exchange links (i.e. they both add a link to the other's website on their own website). Most search engines (certainly Google) are sophisticated enough to detect reciprocal linking and they don't view it very favorably because it is clearly a manufactured method of generating links. Websites with reciprocal links risk being penalized.
ROBOTS.TXT FILE
A file which is used to inform the search engine spider which pages on a site should not be indexed. This file sits in your site's root directory on the web server. (Alternatively, you can do a similar thing by placing tags in the header section of your HTML for search engine robots/spiders to read.)
SANDBOX
Many SEO experts believe that Google 'sandboxes' new websites. Whenever it detects a new website, it withholds its rightful ranking for a period while it determines whether your site is a genuine, credible, long term site. It does this to discourage the creation of SPAM websites (sites which serve no useful purpose other than to boost the ranking of some other site). Likewise, if Google detects a sudden increase (i.e. many hundreds or thousands) in the number of links back to your site, it may sandbox them for a period (or in fact penalize you by lowering your ranking or blacklisting your site altogether).
SEO
Search Engine Optimization. The art of making your website relevant and important so that it ranks high in the search results for a particular word.
SEO COPYWRITER
A 'copywriter' who is not only proficient at web copy, but also experienced in writing copy which is optimized for search engines (and will therefore help you achieve a better search engine ranking for your website).
SEARCH ENGINE
A search engine is an online tool which allows you to search for websites which contain a particular word or phrase. The most well known search engines are Google, Yahoo, and MSN.
A single page which contains a list of text links to every page in the site (and every page contains a text link back to the site map). Think of your site map as being at the center of a spider-web.
Generally refers to unwanted and unrequested email sent en-masse to private email addresses. Also used to refer to websites which appear high in search results without having any useful content. The creators of these sites set them up simply to cash in on their high ranking by selling advertising space, links to other sites, or by linking to other sites of their own and thereby increasing the ranking of those sites. The search engines are becoming increasingly sophisticated, and already have very efficient ways to detect SPAM websites and penalize them.
Spider
Google finds pages on the World Wide Web and records their details in its index by sending out ‘spiders’ or ‘robots’. These spiders make their way from page to page and site to site by following text links.
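The idea of moving from page to page by following links can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not how any real search engine works: the pages live in an in-memory dictionary named `SITE` instead of being downloaded, and the names `LinkCollector`, `extract_links`, and `crawl` are invented for this illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the absolute URL of every <a href="..."> on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page they appear on.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    collector = LinkCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.links


def crawl(pages, start_url):
    """Visit pages breadth-first by following links; return the visit order."""
    seen = [start_url]
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        # Unknown URLs are treated as empty pages in this toy model.
        for link in extract_links(pages.get(url, ""), url):
            if link not in seen:
                seen.append(link)
                queue.append(link)
    return seen


# Hypothetical three-page site used only for this example.
START = "http://www.yourbusinessname.com/"
SITE = {
    START: '<a href="about.htm">About</a> <a href="products.htm">Products</a>',
    START + "about.htm": '<a href="/">Home</a>',
    START + "products.htm": '<a href="about.htm">About</a>',
}

if __name__ == "__main__":
    for url in crawl(SITE, START):
        print(url)
```

A page that no other page links to would never appear in the result, which is one reason a site map with a text link to every page is useful.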
Sponsored links
Paid advertising which displays next to the natural search results. Customers can click on the ad to visit the advertiser’s website. This is how the search engines make their money. Advertisers set their ads up to display whenever someone searches for a word which is related to their product or service. These ads look similar to the natural search results, but are normally labeled “Sponsored Links”, and normally take up a smaller portion of the window. These ads work on a Pay-Per-Click (PPC) basis (i.e. the advertiser only pays when someone clicks on their ad).
Submit
You can submit your domain name to the search engines so that their ‘spiders’ or ‘robots’ will crawl your site. You can also submit articles to ‘article submission sites’ in order to have them published on the Internet.
Text links
A word on a web page which the reader can click to visit another page. Text links are normally blue and underlined. Text links are what ‘spiders’ or ‘robots’ use to jump from page to page and website to website.
URL
Uniform Resource Locator. The address of a particular page published on the Internet. Normally in the form http://www.yourbusinessname.com/AWebPage.htm.
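To make the parts of such an address visible, the short sketch below splits the sample URL from the definition using Python's standard `urllib.parse` module. The helper name `describe_url` is an assumption introduced only for this example.

```python
from urllib.parse import urlparse


def describe_url(url):
    """Split a URL into its protocol, host name, and page path."""
    parts = urlparse(url)
    return {
        "scheme": parts.scheme,  # how the page is fetched, e.g. http
        "host": parts.netloc,    # the domain name of the website
        "path": parts.path,      # the particular page on that site
    }


if __name__ == "__main__":
    print(describe_url("http://www.yourbusinessname.com/AWebPage.htm"))
```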
Web copywriter
A ‘copywriter’ who understands the unique requirements of writing for an online medium.
Webmaster
A person responsible for the management of a particular website.
Word count
The number of words on a particular web page.
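One simple way to measure this for an HTML page is sketched below, assuming that "words" means whitespace-separated pieces of the text found between tags. The names `TextCollector` and `word_count` are invented for this illustration, and the sketch makes no attempt to skip script or style content.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gathers the text that appears between the tags of a page."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def word_count(html):
    """Count whitespace-separated words in the text of an HTML page."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return len(" ".join(collector.chunks).split())


if __name__ == "__main__":
    page = "<h1>Cheap flights</h1><p>Book cheap flights to Goa today.</p>"
    print(word_count(page))
```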
World Wide Web (WWW)
The vast array of documents published on the Internet. It is estimated that the World Wide Web now consists of approximately 11.5 billion pages.
WE DO NOT CLAIM THAT OUR WORK IS UNIQUE, BUT WE CLAIM THAT OUR EFFORTS ARE ALWAYS UNIQUE.
CONCLUSION:
The process of SEO is not easy to tackle, largely because so many pieces of a site factor into the final results. Promoting a site that writers on the web are unlikely to link to is as deadly as creating a fantastic website no one will see. SEO is also a long-term process, both in application and results: those who expect quick rankings after completing a few suggestions in this guide will be deeply disappointed. Search engines can often be frustratingly slow to respond to improvements that will eventually garner significant boosts in traffic.
Search engine optimization is a complex task that requires steady effort. To be most effective, your optimization campaign should be consistent. SEO is a never-ending process, and it is hard to keep up with the different techniques that are used to boost a site’s ranking on search engines. Laying the groundwork for SEO before the site is indexed will aid tremendously in the site’s ranking once it starts to get indexed.
An effective SEO strategy can reap significant dividends. The impact of SEO is undisputed: effective programs drive traffic and show demonstrable returns. Always keep in mind that SEO is an ongoing process and can be incorporated into your annual online marketing budget along with things like regular website updates, newsletter mailings to site subscribers, and online and offline media planning. Always be aware of the experience and needs of your top, loyal customers, since you don’t want to lose what you have in order to attract more customers. Optimizing your website increases the number of targeted visitors, improves your sales and customer loyalty, and of course increases brand recognition.
An optimization campaign also takes time. Search engines may not see or react to changes you’ve made on your site, or links you’ve received, for months. For very small companies, it may be smart to run your own optimization campaign. But for most businesses, it is smart to use a professional search engine optimization company.
Patience is not the only virtue needed for successful SEO. The strategy itself must have a strong foundation in order to succeed. The best sites adhere strictly to these guidelines:
Unique Content - Something that has never before been offered on the web in terms of depth, quality, or presentation.
Access to an Adoptive Community - Connections or alliances with people/websites in an existing online community that is ready to accept, visit, and promote your offering.
Link-Friendly Formatting - Even the best content may be unlikely to be linked to if it displays ads, particularly those that break up the page content or pop up when a visitor comes to the site. Use discretion in presenting your material and remember that links are one of the most valuable commodities a site/page can get, and they'll last far longer than a pop-up ad's revenue.
Monetization Plan - Intelligent systems for monetizing powerful content must exist, or bandwidth, hosting, and development costs will eventually overrun your budget.
Market Awareness - If your site is targeting highly competitive terms, you should make available an online marketing budget, including funds for link buying, and hire or consult with someone experienced in bringing newer sites to the top of the SERPs.
BIBLIOGRAPHY
REFERENCES
We learned and implemented in parallel, so here is the list of sources from which we took guidance.
BOOKS:-
SEO E-BOOK. By:- Brad Collan.
PRACTICAL SEO TECHNIQUES. By:- Macronimous Web Solutions.
101 TIPS ON SEO. By:- Perfect Optimization Company.
WEBSITES:-
www.searchenginechannel.com
www.searchenginewatch.com
www.seobook.com
www.seoworld.com
www.searchengines.com