
Search Engines

WHAT ARE SEARCH ENGINES?
• A web search engine is a software system that
is designed to search for information on the
World Wide Web.
• It uses the keywords in a query to find documents
that relate to them and then puts the
results in order of relevance to the topic that
was searched for.
• A search engine is driven by a complex algorithm
(a computer program).
SOME EXAMPLES OF SEARCH ENGINES
IMPORTANCE
• Search engines are important because, with
over 8 billion web pages available, it would be
impossible to find the specific information that is
needed without them.
• This is why search engines are used to filter
the information that is on the internet and
transform it into results that each individual
can easily access and use within a matter of
seconds.
http://www.worldwidewebsize.com
Purpose of Search Engines
Helping people find what they’re looking for
• Starts with an “information need”
• Converts it to a query
• Gets results
TYPES OF SEARCH ENGINES

• Crawler Based
• Directories
• Hybrid Search Engines
• Meta Search Engines
CRAWLER BASED SEARCH ENGINES
• These types of search engines use a "spider" or a
crawler to search the Internet. The crawler digs
through individual web pages, pulls out keywords
and then adds the pages to the search engine's
database.
• Google and Yahoo are examples of crawler-based
search engines.
• Crawler-based search engines are good when you
have a specific search topic.
What is a Spider or a bot?
• A spider is a program that visits
Web sites and reads their pages
and other information in order
to create entries for a search
engine index. The major search
engines on the Web all have
such a program, which is also
known as a "crawler" or a "bot."
• A bot (short for "robot") is an
automated program that runs
over the Internet. Some bots run
automatically, while others only
execute commands when they
receive specific input.
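The spider behaviour described above can be sketched in a few lines. This is a minimal illustration, not how any particular search engine implements its crawler: the `PAGES` dictionary is a hypothetical in-memory stand-in for the Web (a real spider would fetch pages over HTTP), and the names `LinkExtractor` and `crawl` are made up for this example.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real spider would fetch over HTTP.
PAGES = {
    "a.example": '<a href="b.example">B</a> <a href="c.example">C</a> chairs',
    "b.example": '<a href="a.example">A</a> office desks',
    "c.example": '<a href="d.example">D</a> office chairs',
    "d.example": "no links here",
}


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags and the visible text of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.text.append(data)


def crawl(seed, pages):
    """Breadth-first crawl from seed; returns {url: page text} for visited pages."""
    seen = {seed}
    queue = deque([seed])
    entries = {}
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # unknown page: nothing to read
        parser = LinkExtractor()
        parser.feed(html)
        parser.close()
        # Collapse whitespace so each entry is one clean line of text.
        entries[url] = " ".join(" ".join(parser.text).split())
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return entries


visited = crawl("a.example", PAGES)
print(list(visited))
```

Starting from a single seed page, the sketch reaches every page that is linked, directly or indirectly, from that seed.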
DIRECTORIES
• Directories depend on human editors to create
their listings or database. Yahoo Directory,
Open Directory and LookSmart are a few
examples.
• A web directory is an online list or catalog
of websites. That is, it is a directory on the World
Wide Web of (all or part of) the World Wide
Web.
• Human-powered directories are good when you
are interested in a general topic of search.
HYBRID SEARCH ENGINES
• Hybrid search engines use both crawler-based
searches and directory searches to obtain their
results.
• Examples: Yahoo.com, Google.com
META SEARCH ENGINES
• These transmit user-supplied keywords
simultaneously to several individual search
engines to actually carry out the search.
• Search results returned from all the search
engines can be integrated, duplicates can be
eliminated and additional features such as
clustering by subjects within the search results
can be implemented by meta-search engines.
• Examples: Dogpile, MetaCrawler
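The meta-search idea (send the query to several engines at once, integrate the results, eliminate duplicates) can be sketched as follows. The functions `engine_one` and `engine_two` are hypothetical stand-ins that return fixed lists, and interleaving by rank is just one simple way to integrate results, assumed here for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest


def meta_search(query, engines):
    """Query all engines concurrently, interleave by rank, drop duplicate URLs."""
    with ThreadPoolExecutor() as pool:
        # pool.map keeps the result lists in the same order as `engines`.
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    merged, seen = [], set()
    # Take every engine's 1st result, then every 2nd result, and so on.
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical engines returning fixed results for any query.
def engine_one(query):
    return ["x.example", "y.example", "z.example"]


def engine_two(query):
    return ["y.example", "w.example"]


merged_results = meta_search("office chairs", [engine_one, engine_two])
print(merged_results)
```

Here "y.example" is returned by both engines but appears only once in the merged list.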
Process
• Bots and spiders find new websites and pages by
following links added on a website.
• Once a new page is found, the spider or bot reads
the content and also checks for images.
• Everything is stored in a huge online library
called the “index”.
• This indexed data is stored in an encoded format
to save space.
• A user types a query into the search engine's
search bar.
• The search engine goes back to its mammoth
index library to fetch the required information.
• The search engine finds millions of matching
results, so it uses an algorithm to decide in
which order to display them.
• The information is ready in less than 1 second and
displayed on a SERP [Search Engine Results Page].
• Regularly updated websites with unique content
are given better positions on the SERP.
• The user analyzes the search results and reaches
a website.
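The index lookup and ordering steps in this process can be illustrated with a tiny inverted index. This is a sketch under simple assumptions: the documents in `DOCS` are invented, and ordering results by how often the query words occur is a placeholder for the far richer ranking algorithms real engines use.

```python
from collections import Counter, defaultdict

# Hypothetical crawled pages: URL -> page text.
DOCS = {
    "shop.example": "office chairs and more office chairs",
    "blog.example": "choosing office chairs",
    "news.example": "office news",
}


def build_index(docs):
    """Map each word to {url: number of occurrences}; this is the 'index'."""
    index = defaultdict(Counter)
    for url, text in docs.items():
        for word in text.lower().split():
            index[word][url] += 1
    return index


def search(index, query):
    """Return URLs containing every query word, best match first."""
    words = query.lower().split()
    if not words:
        return []
    postings = [index.get(word, Counter()) for word in words]
    # Keep only pages that contain all of the query words.
    matches = set(postings[0]).intersection(*postings[1:])

    def score(url):
        return sum(posting[url] for posting in postings)

    # Higher score first; ties broken alphabetically for a stable order.
    return sorted(matches, key=lambda url: (-score(url), url))


INDEX = build_index(DOCS)
print(search(INDEX, "office chairs"))
```

The lookup never rereads the pages themselves; it only consults the index built in advance, which is why results can come back quickly.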

• Bots give the full text of web pages to the indexer.
• Stop words (for, in, at, etc.) and punctuation are
ignored.
• The text is converted to lower case and stored.
Term Weighting Factor
• Term Frequency: how many times the term occurred
in the collected text.
• Collection Frequency: used to discriminate one
document from the other.
• Length Normalization: long documents have a larger
term set than short ones.
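The indexing and term weighting steps can be sketched together. Several things here are assumptions made for illustration: the stop-word list is a small sample, "collection frequency" is read as an inverse document frequency factor (log of total documents over documents containing the term), and length normalization is done by scaling each document's weights to unit length.

```python
import math
import string
from collections import Counter

STOP_WORDS = {"for", "in", "at", "the", "a", "of", "and"}  # small sample list


def tokenize(text):
    """Lowercase the text, strip punctuation, and drop stop words."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [word for word in cleaned.split() if word not in STOP_WORDS]


def term_weights(docs):
    """Return {doc_id: {term: weight}} using tf * idf, scaled to unit length."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for words in tokens.values() for term in set(words))
    weights = {}
    for doc_id, words in tokens.items():
        tf = Counter(words)  # term frequency within this document
        raw = {term: tf[term] * math.log(n_docs / df[term]) for term in tf}
        norm = math.sqrt(sum(w * w for w in raw.values()))  # length normalization
        weights[doc_id] = {t: (w / norm if norm else 0.0) for t, w in raw.items()}
    return weights


SAMPLE = {"d1": "Office chairs", "d2": "Office desks for the office"}
WEIGHTS = term_weights(SAMPLE)
print(tokenize("Chairs for the Office, in stock!"))
print(WEIGHTS)
```

In this sample, "office" occurs in both documents, so it gets weight 0: it cannot discriminate one document from the other, while "chairs" and "desks" can.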
• It is not possible to keep up with the growth of
the web and update the content immediately. By the
time a bot is able to crawl through, its indexed
content is already outdated.
• So the web has been divided into segments, and
the index is incrementally updated.
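One way to picture segment-by-segment updating is sketched below. How real engines partition the web is not stated in the source; assigning URLs to segments with a CRC32 checksum, and the helper names `segment_of` and `refresh_segment`, are assumptions for this example only.

```python
import zlib


def segment_of(url, num_segments):
    """Assign a URL to a segment deterministically (CRC32 is an arbitrary choice)."""
    return zlib.crc32(url.encode("utf-8")) % num_segments


def refresh_segment(index, fetch, urls, segment, num_segments):
    """Re-fetch only the URLs in one segment and overwrite their index entries."""
    updated = []
    for url in urls:
        if segment_of(url, num_segments) == segment:
            index[url] = fetch(url)
            updated.append(url)
    return updated


# Hypothetical data: every page starts with stale content in the index.
URLS = ["a.example", "b.example", "c.example", "d.example", "e.example"]
index = {url: "old" for url in URLS}


def fake_fetch(url):
    """Stand-in for downloading the current version of a page."""
    return "new " + url


# One update cycle touches a single segment; the rest of the index is untouched.
touched = refresh_segment(index, fake_fetch, URLS, segment=0, num_segments=3)
print(touched)
```

Running the refresh for segments 0, 1 and 2 in turn updates every page exactly once, without ever rebuilding the whole index in one pass.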
PageRank – Google’s Secret Algorithm
• Factor groups shown in the diagram: Reputation,
Latest, Query Terms, Content, User
• Factors shown in the diagram: Popularity, Authority,
Trustworthy, Freshness, Relevance, Position, Size,
Proximity, Geographic Region, Web History
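The slide does not describe how PageRank is computed. In its originally published form, PageRank scores a page by the links pointing to it, with links from highly scored pages counting for more. The sketch below uses simple repeated iteration; the damping value 0.85, the iteration count, and the tiny link graph are assumptions for illustration, and this is not a description of Google's current ranking.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links: {page: [pages it links to]}. Returns {page: score}; scores sum to 1."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no outgoing links): spread its score over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked from a, b and d; nothing links to "d".
GRAPH = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
RANKS = pagerank(GRAPH)
print(sorted(RANKS, key=RANKS.get, reverse=True))
```

In this graph the most linked-to page ends up with the highest score and the page with no incoming links ends up with the lowest.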
• The algorithms get to the deeper meaning of the
words you type in the search bar.
• A search engine identifies and corrects possible
spelling errors and provides alternatives.
• Autocomplete predicts what you might be searching
for. This includes understanding terms with more
than one meaning.
• The previous searches help the engine comprehend
what the user might be looking for.
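Spelling correction and autocomplete can be approximated with Python's standard library. This is a toy sketch: the vocabulary and past queries are invented, string similarity via `difflib` stands in for the much richer signals a real engine uses, and the autocomplete here is a plain prefix match over earlier searches.

```python
import difflib

VOCABULARY = ["office", "chairs", "search", "engine", "spider"]  # hypothetical
PAST_QUERIES = ["office chairs", "office desks", "search engines"]  # hypothetical


def suggest_spelling(word, vocabulary=VOCABULARY):
    """Return the closest known word, or the word itself if nothing is close."""
    matches = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
    return matches[0] if matches else word


def autocomplete(prefix, past_queries=PAST_QUERIES, limit=5):
    """Suggest earlier queries that start with what the user has typed so far."""
    prefix = prefix.lower()
    return [query for query in past_queries if query.startswith(prefix)][:limit]


print(suggest_spelling("chiars"))
print(autocomplete("off"))
```

The `cutoff` parameter controls how similar a word must be before it is offered as a correction; 0.6 is simply the library default.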
• A lot more goes into displaying the most relevant
results to the user.
• Search engines like Google rank based on more
than 200 factors.
Search is Mostly Invisible
• Like an iceberg, 2/3 below water
• Diagram labels: user interface, search
functionality, content
