Web Search-Engines: Preksha Mangal B-Tech CS-3 Year

WEB SEARCH-ENGINES
PREKSHA MANGAL
B-TECH
CS-3rd year
CONTENTS
• INTRODUCTION
• HISTORY
• USE OF SEARCH ENGINES
• TYPES OF SEARCH ENGINES & WORKING
• COMPARISON OF TOP SEARCH ENGINES
• CONCLUSION
INTRODUCTION
• A search engine is a software program that searches for sites based on the words that you designate as search terms.
• Search engines look through their own databases to find what you are looking for.
• They are information retrieval systems.

HISTORY
• ARCHIE - first search tool on the Internet in
• GOPHER - indexed plain text documents
• JUGHEAD - searched the files stored in the Gopher index
• WANDEX - first Web search engine

Use of Search Engines…
TYPES OF SEARCH ENGINES
HUMAN POWERED DIRECTORIES
CRAWLER BASED
HYBRID
HUMAN POWERED DIRECTORIES
They are staffed by human editors who consider every newly submitted website and, if they decide it is acceptable, assign it to an appropriate category.
E.g.: Yahoo search engine, Open Directory Project
WORKING…
A directory specializes in linking to other websites and categorizing those links.
It lists websites by category and subcategory. These website entries are found by humans.
The categorization is based on the whole website rather than on one page or a set of keywords.
You submit a short description of your entire site to the directory. A search looks for matches only in the submitted descriptions.
MERITS:
It is useful for “hidden information”.
DEMERITS:
It is slow, as it requires a lot of human resources.
It is expensive and subjective.
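The directory lookup described under WORKING can be sketched in a few lines of Python. This is only an illustration: the Listing structure, the example URLs and the rule that every query word must appear in the description are assumptions, not the way any real directory is implemented.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    """One editor-approved site entry (hypothetical structure)."""
    url: str
    category: str
    description: str  # short description submitted for the whole site


def search_directory(listings, query):
    """Return listings whose submitted description contains every query word."""
    words = query.lower().split()
    return [
        entry for entry in listings
        if all(word in entry.description.lower().split() for word in words)
    ]


# Made-up entries standing in for sites that editors have accepted.
listings = [
    Listing("http://example.com/cars", "Autos", "Reviews of new cars and trucks"),
    Listing("http://example.org/recipes", "Food", "Easy vegetarian recipes"),
]

matches = search_directory(listings, "vegetarian recipes")
print([entry.url for entry in matches])
```

Only the description is searched, so a word that appears on the site itself but not in the submitted description gives no match.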
CRAWLER BASED SEARCH ENGINES
They create their listings automatically. They “crawl” or “spider” the web, then people search through what they have found.
E.g.: Google search engine

WORKING…
Web Crawling
Indexing
Searching
STEP 1: Web Crawling
It is the process of scanning websites to add new pages and to update the existing ones.
Crawlers are given an initial set of URLs whose pages they retrieve. They extract the URLs that appear on the crawled pages and pass them to the Crawler Control Module.
The Crawler Control Module decides which pages to visit next and gives their URLs back to the crawlers.
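The loop between the crawler and the Crawler Control Module can be sketched as follows. This is a minimal sketch under stated assumptions: the "web" is a made-up in-memory dictionary instead of real HTTP fetching, the class and function names are hypothetical, and the control module simply visits unseen URLs in the order they were found.

```python
import re
from collections import deque

# Hypothetical in-memory "web": URL -> page HTML. A real crawler would fetch over HTTP.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">b</a> <a href="http://c.example/">c</a>',
    "http://b.example/": '<a href="http://a.example/">a</a>',
    "http://c.example/": '<a href="http://d.example/">d</a>',
    "http://d.example/": "no links here",
}

LINK_PATTERN = re.compile(r'href="([^"]+)"')


class CrawlerControl:
    """Decides which URLs to visit next (assumed policy: unseen URLs, oldest first)."""

    def __init__(self, seed_urls):
        self.queue = deque(seed_urls)
        self.seen = set(seed_urls)

    def report(self, urls):
        """Receive URLs extracted by the crawler and queue the new ones."""
        for url in urls:
            if url not in self.seen:
                self.seen.add(url)
                self.queue.append(url)

    def next_url(self):
        """Hand the next URL back to the crawler, or None when nothing is left."""
        return self.queue.popleft() if self.queue else None


def crawl(seed_urls, max_pages=10):
    """Retrieve pages starting from the seed URLs, following extracted links."""
    control = CrawlerControl(seed_urls)
    pages = {}
    while len(pages) < max_pages:
        url = control.next_url()
        if url is None:
            break
        html = FAKE_WEB.get(url)
        if html is None:
            continue  # page could not be retrieved
        pages[url] = html
        control.report(LINK_PATTERN.findall(html))
    return pages


pages = crawl(["http://a.example/"])
print(list(pages))
```

Keeping the "seen" set inside the control module is what stops the crawler from looping forever between pages that link to each other.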
STEP 2: Indexing
The index, sometimes called the “catalog”, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, this book is updated with the new information.
Extracts words from each page it visits
Records the URLs in the LOOK UP TABLE
Points to pages covered in the crawling process
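The look up table can be pictured as a mapping from each word to the pages that contain it. The sketch below is an assumption about one simple way to build and update such a table; the Index class, its attribute names and the example pages are hypothetical.

```python
import re
from collections import defaultdict


class Index:
    """Toy look up table: word -> set of URLs whose page contains that word."""

    def __init__(self):
        self.lookup = defaultdict(set)  # word -> URLs
        self.words_by_url = {}          # URL -> words currently indexed for it

    def add_page(self, url, text):
        """Index a page, replacing any earlier entry for the same URL."""
        # Remove the old words first, so a changed page does not leave stale entries.
        for word in self.words_by_url.get(url, ()):
            self.lookup[word].discard(url)
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        for word in words:
            self.lookup[word].add(url)
        self.words_by_url[url] = words


index = Index()
index.add_page("http://a.example/", "Silicon Valley startups")
index.add_page("http://b.example/", "Valley hiking trails")
print(sorted(index.lookup["valley"]))

# The second page changes, so the table is updated with the new information.
index.add_page("http://b.example/", "Mountain hiking trails")
print(sorted(index.lookup["valley"]))
```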
STEP 3: Searching
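The slides give no detail for this step, so the following is only a sketch of one plausible approach: look up each query word in the index, keep the pages that contain all of them, and list the pages with the most occurrences first. The example pages, the function names and the ranking rule are assumptions; real engines use far more elaborate ranking.

```python
from collections import Counter, defaultdict

# Hypothetical crawled pages: URL -> text.
PAGES = {
    "http://a.example/": "silicon valley news and silicon valley jobs",
    "http://b.example/": "valley hiking trails",
    "http://c.example/": "silicon chip design",
}


def build_index(pages):
    """Build word -> {url: number of times the word occurs on that page}."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs containing every query word, most occurrences first."""
    words = query.lower().split()
    if not words:
        return []
    postings = [index.get(word, Counter()) for word in words]
    common = set(postings[0]).intersection(*postings[1:])

    def score(url):
        return sum(posting[url] for posting in postings)

    # Ties are broken alphabetically so the order is deterministic.
    return sorted(common, key=lambda url: (-score(url), url))


index = build_index(PAGES)
print(search(index, "silicon valley"))
print(search(index, "silicon"))
```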
HYBRID SEARCH ENGINES
In the web’s early days, a search engine presented either crawler based results or human powered listings.
Today, it is extremely common for both types of results to be presented.
Usually, a hybrid search engine will favor one type of listing over the other.
E.g.: MSN Search Engine
META SEARCH ENGINES
Meta search engines send a query to multiple search engines and combine the results.
They do not have their own databases, but they have their own business models, which affect the results.
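Combining results from several engines can be sketched as below. The two engine functions are hypothetical stand-ins that return fixed lists, and the merging rule (take results in turn from each engine and drop duplicates) is an assumption; a real meta search engine would call the engines over the network and may weight them according to its business model.

```python
from itertools import zip_longest


# Hypothetical stand-ins for real engines: each returns a ranked list of URLs.
def engine_a(query):
    return ["http://x.example/", "http://y.example/", "http://z.example/"]


def engine_b(query):
    return ["http://y.example/", "http://w.example/"]


def meta_search(query, engines):
    """Interleave each engine's ranked results, dropping duplicate URLs."""
    result_lists = [engine(query) for engine in engines]
    merged, seen = [], set()
    # zip_longest pads the shorter lists with None, which is skipped below.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


print(meta_search("silicon valley", [engine_a, engine_b]))
```

The meta search engine stores no pages of its own here; it only forwards the query and merges what comes back.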
COMPARING THE TOP SEARCH ENGINES
THE 3 TOP SEARCH ENGINES ARE:
GOOGLE
YAHOO
MSN
GOOGLE: 57%
YAHOO: 22%
MSN: 8%
OTHER ENGINES: 13%

• Google is regarded as the best search engine.
• Yahoo has been one of the oldest powerhouses known on the internet.
• MSN Live Search by Microsoft, formerly known as ‘MSN Search’, was used by its own community and was not widely known.
Results Fetching Speed
[These tests were done on a 512 Kbps shared (56.6 Kbps Core) connection]
Keyword used: Silicon Valley
Google - 0.13 seconds to fetch 53,200,000 pages
Yahoo - 0.17 seconds to fetch 35,300,000 pages
MSN Live - 0.13 seconds to fetch 6,061,755 pages
Google and MSN Live tie at 0.13 seconds, but Google fetches by far the most pages in that time, so Google is the fastest search engine.

SEARCH FEATURES:-
Google:
Offers limited Boolean searching
Has a powerful built-in calculator
Yahoo:
Offers better Boolean searching than Google
Has a good built-in calculator as well
MSN:
Offers full Boolean searching
Has a built-in calculator, but it proves insufficient most of the time
BETTER SEARCHING FEATURES ARE FOUND IN YAHOO
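Boolean searching means combining search words with AND, OR and NOT. A minimal sketch, assuming a made-up look up table that maps each word to the set of page IDs containing it, shows how the three operators reduce to set operations. It does not reflect how any of the three engines implements the feature.

```python
# Hypothetical look up table: word -> set of page IDs containing it.
INDEX = {
    "silicon": {1, 2, 3},
    "valley": {2, 3, 4},
    "hiking": {4},
}


def pages_with(word):
    """Return the page IDs for a word, or an empty set if the word is unknown."""
    return INDEX.get(word, set())


both = pages_with("silicon") & pages_with("valley")      # silicon AND valley
either = pages_with("silicon") | pages_with("hiking")    # silicon OR hiking
excluding = pages_with("valley") - pages_with("hiking")  # valley AND NOT hiking

print(sorted(both), sorted(either), sorted(excluding))
```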

ALGORITHM, TECHNOLOGY AND QUALITY CODING
• These 3 companies do not publish their source code, so which one has the better search technology cannot be determined directly.
• However, an analysis of various cached pages from the 3 search engines found that “Google has better search technology than the other 2 and its programmers are working more seriously”.
GOOGLE:
Google shows the message “Google is not affiliated to the authors of this page or responsible for its content” when other search engines are entered as the query.
But with its own cached page no such message is displayed.
Yahoo:
When its cached page was analyzed, it gave the message “YAHOO is not affiliated to the authors of it pages or responsible for its content”.
MSN:
When its cached page was analyzed, it gave the message “MSN is not affiliated to the authors of its page or responsible for its content”
CONCLUSION
• Though there are many search engines available on the web, the searching methods and the engines have a long way to go before they retrieve information on relevant topics efficiently.
• None of the search engines available today is perfect, but using the right one at the right time makes all the difference.
THANK YOU
REFERENCES
www.howstuffworks.com
www.scribd.com
www.searchenginewatch.com
www.buzzle.com
