
WEB SEARCH-ENGINES

PREKSHA MANGAL
B-TECH
CS-3rd year
CONTENTS

• INTRODUCTION
• HISTORY
• USE OF SEARCH ENGINES
• TYPES OF SEARCH ENGINES & WORKING
• COMPARISON OF TOP SEARCH ENGINES
• CONCLUSION
INTRODUCTION
• A search engine is a software program that
searches for sites based on the words that you
designate as search terms.

• Search engines look through their own
databases to find what you are looking for.

• They are information retrieval systems.


HISTORY
• ARCHIE - First search tool on the Internet

• GOPHER - Indexed plain text documents

• JUGHEAD - Searched the files stored in the Gopher index

• WANDEX - First Web search engine


Use of Search Engines…
TYPES OF SEARCH ENGINES
HUMAN POWERED
DIRECTORIES

CRAWLER BASED

HYBRID
HUMAN POWERED
DIRECTORIES
They are staffed by human editors
who review every new website
submitted and, if they decide it is
acceptable, assign it to an appropriate
category.

Eg: Yahoo search engine, Open Directory Project
WORKING…

It specializes in linking to other websites and categorizing
those links.

It lists websites by category and subcategory. These website
entries are found by humans.

The categorization is based on the whole website rather than on one
page or a set of keywords.

You submit a short description of your entire site to the
directory. A search looks for matches only in the descriptions
submitted.
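The point that a directory search matches only the submitted descriptions can be sketched in a few lines of Python. This is a minimal illustration, not how any real directory is implemented: the `DirectoryEntry` type, the `search_directory` function and the example URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DirectoryEntry:
    """One human-reviewed listing (hypothetical structure)."""
    url: str
    category: str
    description: str  # short description submitted for the whole site


def search_directory(entries, query):
    """Return entries whose description contains every query word.

    Page content is never consulted: only the submitted description
    is searched, as in a human powered directory.
    """
    terms = query.lower().split()
    matches = []
    for entry in entries:
        words = entry.description.lower().split()
        if all(term in words for term in terms):
            matches.append(entry)
    return matches


entries = [
    DirectoryEntry("http://cars.example/", "Autos", "used cars for sale"),
    DirectoryEntry("http://news.example/", "News", "daily technology news"),
]
hits = search_directory(entries, "technology news")
```

Here `hits` contains only the second entry. A site whose pages mention "technology news" but whose description does not would be missed, which is one reason directory results differ from crawler based results.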
MERITS:
It is useful for “hidden information”.

DEMERITS:
This is slow, as it requires a lot of human resources.
It is expensive and subjective.
CRAWLER BASED
SEARCH ENGINES
They create their listings automatically. They “crawl”
or “spider” the web, then people search through what
they have found.

Eg :- Google search engine


Working …..

Web Crawling

Indexing

Searching
STEP 1: Web Crawling

It is the process of scanning websites to add new
pages and to update the existing ones.

Crawlers are given an initial set of URLs whose
pages they retrieve. They extract the URLs that appear
on the crawled pages and pass this information to the
Crawler Control Module.

The Crawler Control Module decides which pages to
visit next and gives their URLs back to the crawlers.
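The crawling loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: the queue named `frontier` stands in for the Crawler Control Module and simply visits pages in the order they were discovered, and `fetch` is a hypothetical function supplied by the caller that returns a page's HTML or None. A real crawler would also respect robots.txt, rate limits and revisit schedules.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_urls, fetch, max_pages=100):
    """Breadth-first crawl starting from the seed URLs.

    `fetch(url)` is an assumed helper returning HTML text, or None
    when the page cannot be retrieved.
    """
    frontier = deque(seed_urls)  # stand-in for the Crawler Control Module
    seen = set(seed_urls)        # URLs already queued, to avoid repeats
    pages = {}
    while frontier and len(pages) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return pages


# A tiny in-memory "web" so the sketch runs without network access.
FAKE_WEB = {
    "http://a.example/": '<a href="/b">b</a> <a href="http://a.example/">home</a>',
    "http://a.example/b": '<a href="/c">c</a>',
}
pages = crawl(["http://a.example/"], FAKE_WEB.get)
```

The crawl retrieves the seed page and the page it links to; the link to `/c` is queued but yields nothing because the fake web has no such page.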
STEP 2: Indexing

The index, sometimes called the “catalog”, is like a giant book
containing a copy of every web page that the spider finds. If a web page
changes, this book is updated with the new information.

The indexer:

Extracts words from each page it visits

Records the URLs in the LOOKUP TABLE

Points to pages covered in the crawling process
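A common way to realize such a lookup table is an inverted index, which maps each word to the set of URLs containing it. The sketch below is an assumption about the data structure, not a description of any particular engine; the function names `build_index` and `search` and the sample pages are hypothetical. It also shows how a query can then be answered by intersecting the URL sets of its words.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


pages = {
    "http://one.example/": "Silicon Valley startups",
    "http://two.example/": "Silicon chips and wafers",
}
index = build_index(pages)
both_words = search(index, "silicon valley")
```

Here `both_words` holds only the first URL, while a search for "silicon" alone returns both. When a page changes, re-running the indexing for that URL keeps the table current.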
STEP 3: Searching
Hybrid Search Engines
In the web’s early days, a search engine presented
either crawler based results or human powered listings.

Today, it is extremely common for both types of
results to be presented.

Usually, a hybrid search engine will favor one type
of listing over the other.

Eg:- MSN Search Engine


META SEARCH - ENGINES
Meta search engines send a query to multiple search engines
and combine the results they return.

They do not have their own databases, but their own
business models affect the results.
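How results from several engines might be combined can be sketched as below. The merging rule here (take results rank by rank from each engine and drop duplicates) is only one simple possibility chosen for illustration; real meta search engines use their own, usually undisclosed, ranking. The `engine_a` and `engine_b` functions are hypothetical stand-ins for calls to real engines.

```python
from itertools import zip_longest


def meta_search(query, engines):
    """Query every engine and interleave the ranked results.

    `engines` is a list of callables, each taking a query string and
    returning a ranked list of URLs. No local database is kept.
    """
    ranked_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # Walk the lists rank by rank: all first results, then all second ones...
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


def engine_a(query):
    return ["http://x.example/", "http://y.example/", "http://z.example/"]


def engine_b(query):
    return ["http://y.example/", "http://w.example/"]


combined = meta_search("silicon valley", [engine_a, engine_b])
```

A URL returned by both engines appears once, at the best rank at which it was first seen.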
COMPARING THE TOP
SEARCH ENGINES
3 TOP SEARCH ENGINES ARE:

GOOGLE

YAHOO

MSN
GOOGLE: 57%

YAHOO: 22%

MSN: 8%

OTHER ENGINES: 13%

• Google is regarded as the best search engine.

• Yahoo has been one of the oldest powerhouses known on the Internet.

• MSN Live Search by Microsoft, formerly known as ‘MSN Search’, was used mainly by
its own community and was not widely known.

Results Fetching Speed


[These tests were done on a 512 Kbps shared connection (56.6 Kbps core)]

Keyword used: Silicon Valley

Google - 0.13 seconds to fetch 53,200,000 pages
Yahoo - 0.17 seconds to fetch 35,300,000 pages
MSN Live - 0.13 seconds to fetch 6,061,755 pages

Google and MSN Live tie on fetch time at 0.13 seconds, but Google reports far
more pages in that time, so by this test Google is the fastest search engine.
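One way to compare the three measurements is to divide the number of pages reported by the time taken. The short sketch below does that arithmetic using only the figures listed above; treating pages per second as the yardstick is an assumption made here for illustration, not a metric stated in the test.

```python
# (seconds taken, pages reported) for the keyword "Silicon Valley"
results = {
    "Google": (0.13, 53_200_000),
    "Yahoo": (0.17, 35_300_000),
    "MSN Live": (0.13, 6_061_755),
}

# Pages reported per second of fetch time, per engine.
throughput = {
    name: pages / seconds for name, (seconds, pages) in results.items()
}

# Engine with the highest pages-per-second figure.
fastest = max(throughput, key=throughput.get)

for name, rate in sorted(throughput.items(), key=lambda item: -item[1]):
    print(f"{name}: about {rate / 1e6:.0f} million pages per second")
```

By this measure Google leads, followed by Yahoo and then MSN Live, even though MSN Live matches Google's raw time of 0.13 seconds.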


SEARCH FEATURES:-

Google:
Offers limited Boolean searching
Has a powerful built-in calculator

Yahoo:
Offers better Boolean searching than Google
Has a good built-in calculator as well

MSN:
Offers full Boolean searching
Has a built-in calculator, but it proves insufficient most of the time

BETTER SEARCHING FEATURES ARE FOUND IN YAHOO


ALGORITHM, TECHNOLOGY AND QUALITY
CODING
• These 3 companies do not publish their source code, so which one has the better
search technology cannot be determined directly.
• But when various cached pages from the 3 search engines were analyzed,
it was found that “Google has better search technology than other 2 and its
programmers are working more seriously”.

GOOGLE:

Google shows the message “Google is not affiliated to the authors of this page or
responsible for its content” when other search engines are entered as the query.
But with its own cached page, no such message is displayed.

Yahoo:

When its cached page was analyzed, it gave the message
“YAHOO is not affiliated to the authors of it pages or responsible for its content”.

MSN:

When its cached page was analyzed, it gave the message
“MSN is not affiliated to the authors of its page or responsible for its content”
CONCLUSION
• Though there are many search engines available on the web,
both the searching methods and the engines still have a long way to go
before they retrieve information on relevant topics efficiently.

• None of the search engines available today is perfect, but using
the right one at the right time makes all the difference.
THANK YOU
