
Prepared By:

Hetal Dodia (8)


Asif Kureshi (17)
Tejas Patel (27)
Nidhi Trivedi (37)

1
 Internet
 Searching
 Search Engine
 History
 Examples
 Types Of Search Engine
 How It Works.

2
Internet
 An interconnected network of thousands of networks and
millions of computers linking businesses, educational
institutions, government agencies, and individuals.

Searching
 A large volume of information makes a site huge and complex, and
makes navigation difficult.
 Search is the user's lifeline for mastering complex websites.
 A search feature is essential for users who revisit a site
looking for specific information.

4
Types of Searching
 A search can be of various types:
 Internet search: Search engines such as Yahoo and Infoseek crawl
the web, gather web pages or information about them, index them,
and retrieve them when a query term matches.
 Database search: Databases store their information neatly
organized into fields, and a search interface is provided for querying them.

5
SEARCH ENGINE
 “A tool designed to search for information on the World Wide Web. The
information may consist of web pages, images, information and other
types of files.”
 A search engine is an information retrieval system
designed to help find information stored on a
computer system.
 The search results are usually presented in a list and
are commonly called hits.

6
 Every Internet user benefits from a good knowledge of search
engines and searching in order to explore more fully what the
Internet offers.
 Search engines help to minimize the time required to find
information and the amount of information that must be
consulted.
 Searching is one of the most common activities on the Internet.
 Search engines, as instruments of searching, are special sites on
the Web designed to help people find information stored
on other sites.
 Examples include external engines such as Google, Yahoo, MSN, AOL and Live.


7
History of search engine
 At first there was only a manually maintained list of web servers; new servers were announced under the title “What’s New”.
 The first tool for searching: Archie.
 The rise of Gopher (1991) then led to two new search programs: Veronica and Jughead.
 Until 1993, no search engine existed for the web.
 The web’s first primitive search engine: W3Catalog (1993).

8
History Cont…..
 First “all text” crawler-based search engine: WebCrawler (1994).
 Google adopted the idea of selling search terms in 1998 from a small
company named goto.com.
 Search engines were among the brightest stars of the Internet investing frenzy of the late 1990s.
 Google rose to prominence around 2000.
 Microsoft’s first search engine, MSN Search, used search results from Inktomi.

9
History Cont……
 Microsoft’s rebranded search engine, Bing, launched on June 1, 2009.
 On July 29, 2009, Yahoo and Microsoft finalized a deal under which Yahoo search would be powered by Bing.
 In 2012, Google released the beta version of Open Drive, available as
a Chrome application.

10
Market of Search Engines

11
The Best & Most popular Search engine

12
4th Most visited website in the world

13
Ask Question Search Engine

14
Microsoft Bing Search Engine

15
Types Of Search Engines
1. Crawler-Based Search Engines
2. Human-Powered Directories
3. "Hybrid Search Engines" or Mixed Results
4. Meta Search Engine

16
Crawler-Based Search Engines

 Crawler-based search engines, such as Google, create their listings
automatically. They "crawl" or "spider" the web, then people search
through what they have found.
 If you change your web pages, crawler-based search engines
eventually find these changes, and that can affect how you are
listed. Page titles, body copy and other elements all play a role.

17
Cont….
 Crawler-based search engines are good when you have a specific
search topic in mind, and can be very efficient at finding relevant
information in that situation.
 Examples: Google, AllTheWeb and AltaVista

18
Human-Powered Directories
 A human-powered directory, such as the Open Directory, depends on
humans for its listings.
 You submit a short description to the directory for your entire site, or
editors write one for sites they review. A search looks for matches only in
the descriptions submitted.
 Changing your web pages has no effect on your listing. Things that are
useful for improving a listing with a search engine have nothing to do with
improving a listing in a directory.
 The only exception is that a good site, with good content, might be more
likely to get reviewed for free than a poor site

19
Cont..
 Human-powered directories are good when you are interested in a
general topic of search.
 In this situation, a directory can guide and help you narrow your
search and get refined results.
 Therefore, search results found in a human-powered directory are
usually more relevant to the search topic and more accurate.
 However, this is not an efficient way to find information when a
specific search topic is in mind.
 Examples: Yahoo Directory, Open Directory and LookSmart

20
Pros of Human-Powered Directories
 Fast answers (sometimes).
 Answers sent directly to your phone or email. This is especially
beneficial if you are on the go and using a service such
as ChaCha or KGB, which lets you ask your question and receive the answer
via text message.
 Sometimes standard search engines don't know what you're talking
about, and that's where dealing with an actual human helps.

21
Cons of Human-Powered Directories
 Lengthy search time: Having to wait for what may seem like forever
before receiving an answer.
 Unanswered questions: While some sites may take days, other sites may
not have an answer for your question at all.
 Human error: We all know and trust Google to deliver our answers, but
we have no idea who is answering our questions on human-powered sites,
or what their qualifications are. Would you trust just anyone? Because I
certainly wouldn't.
 Annoying categorization: Many sites ask you to categorize, sub-categorize,
and sub-sub-categorize your questions, which takes the
simplicity out of these human-powered search engines.

22
"Hybrid Search Engines" or Mixed Results
 Hybrid search engines use a combination of both crawler-based
results and directory results. More and more search engines these
days are moving to a hybrid-based model.

 It is extremely common for both types of results to be presented.
Usually, a hybrid search engine will favor one type of listing over
another.
 For example, MSN Search is more likely to present human-powered
listings from LookSmart.
 Examples: Yahoo, Google

23
Meta Search Engine
 A meta search engine transmits user-supplied keywords simultaneously to several
individual search engines, which actually carry out the search.
 Search results returned from all the search engines can be
integrated, duplicates can be eliminated, and additional features
such as clustering by subject within the search results can be
implemented by the meta search engine.
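
The fan-out and merge steps described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: `engine_a`, `engine_b`, their canned results, and the simple interleaving strategy are all hypothetical, and no real search engine is contacted.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest

# Hypothetical stand-ins for primary search engines; each returns a
# ranked list of URLs for the query (canned data, no network access).
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/x", "http://a.example/2"]

def engine_b(query):
    return ["http://shared.example/x", "http://b.example/1"]

def meta_search(query, engines):
    # Send the same keywords to every engine in parallel.
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))
    # Interleave the ranked lists and drop duplicate URLs.
    merged, seen = [], set()
    for rank_group in zip_longest(*result_lists):
        for url in rank_group:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged

results = meta_search("search engines", [engine_a, engine_b])
```

Interleaving by rank is just one possible way to combine the lists; a real meta search engine might instead re-score or cluster the results.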

24
Meta Search Engine Cont…..
 Meta-search engines are good for saving time by searching only in
one place and sparing the need to use and learn several separate
search engines.
 But since meta-search engines do not allow for input of many
search variables, their best use is to find hits on obscure items or to
see if something can be found using the Internet.

25
Pros of meta search engines
1. Searching with many primary search engines often finds results missed by
a single primary engine.
2. Requesting results from many primary engines in parallel saves time.
3. Eliminating duplicate results also saves time.
4. Getting results from many different primary engines
provides opportunities to explore how best to combine the separate result
lists.
26
Cons of meta search engines
1. Timeouts or long waits may occur if the meta search engine is
having difficulty contacting the primary engine.

2. Many meta search engines only get the top 10 to 50 results per
primary engine.

3. Some advanced features (e.g. phrase searching) may not be
available.

4. Many meta search engines exclude one or more of the major
primary search engines (Google, Microsoft, or Yahoo).

27
How it works
1. Index ahead of time
 Find files or records
 Open each one and read it
 Store each word in a searchable index
2. Provide search forms
 Match the query terms with words in the index
 Sort documents by relevance
3. Display results
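
The three steps above can be sketched in a few lines. This is a minimal illustration, not a production design: the `DOCS` data is made up, and "relevance" is assumed here to be nothing more than the number of matching word occurrences.

```python
from collections import defaultdict

# Hypothetical documents standing in for crawled files or records.
DOCS = {
    "doc1": "search engines index the web",
    "doc2": "a web spider crawls the web",
    "doc3": "databases store records in fields",
}

def build_index(docs):
    # Step 1: open each document, read it, store each word in an index.
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index

def search(index, query):
    # Step 2: match query terms with the index and sort by relevance
    # (assumed here to be the total count of matching words).
    scores = defaultdict(int)
    for term in query.lower().split():
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return sorted(scores, key=lambda d: (-scores[d], d))

index = build_index(DOCS)
hits = search(index, "web spider")

# Step 3: display results.
for doc_id in hits:
    print(doc_id, "-", DOCS[doc_id])
```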

28
29
1. Index ahead of time by spiders
 To find information on the hundreds of millions of Web pages that
exist, a search engine employs special software robots,
called spiders, to build lists of the words found on Web sites.

 A spider is a program that automatically fetches Web pages. Spiders are used
to feed pages to search engines. It is called a spider because it crawls
over the Web. Another term for these programs is web crawler.
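
A spider's basic loop (fetch a page, record its words, follow its links) might look like the sketch below. To keep it self-contained, it crawls a tiny made-up in-memory "web" (`WEB`) instead of fetching real pages; the URLs and the `fetch` callback are assumptions for illustration only.

```python
from collections import deque

# Hypothetical miniature "web": each URL maps to (page text, outgoing links).
WEB = {
    "http://example.com/": ("welcome home", ["http://example.com/a", "http://example.com/b"]),
    "http://example.com/a": ("page about spiders", ["http://example.com/b"]),
    "http://example.com/b": ("page about indexes", ["http://example.com/"]),
}

def crawl(start_url, fetch):
    # Breadth-first traversal: fetch a page, record its words, queue its links.
    words_by_url = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in words_by_url:
            continue  # already visited
        text, links = fetch(url)
        words_by_url[url] = text.split()
        queue.extend(link for link in links if link not in words_by_url)
    return words_by_url

pages = crawl("http://example.com/", lambda url: WEB[url])
```

A real crawler would also need politeness rules, error handling and URL normalization, none of which are shown here.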

30
Cont……
 Spiders store the lists in the engine’s database.
 The engine’s indexing software builds an index of words.
 Information is matched against the query input and retrieved
(processing algorithm).

31
What the Index Needs
 Basic information for document or record
 File name / URL / record ID
 Title or equivalent
 Size, date, MIME type
 Full text of item
 More metadata
 Product name, picture ID
 Category, topic, or subject
 Other attributes, for relevance ranking and display
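
One way to picture such an index entry is as a record type. The field names below are illustrative assumptions drawn from the list above, not a schema used by any particular search engine.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class IndexRecord:
    # Basic information for the document or record
    url: str
    title: str
    size_bytes: int
    modified: date
    mime_type: str
    full_text: str
    # Additional metadata, usable for relevance ranking and display
    category: str = ""
    attributes: dict = field(default_factory=dict)

record = IndexRecord(
    url="http://example.com/widgets",
    title="Widgets",
    size_bytes=2048,
    modified=date(2013, 1, 15),
    mime_type="text/html",
    full_text="All about widgets",
    category="Products",
    attributes={"product_name": "Widget", "picture_id": "img-001"},
)
```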

32
Simple Index Diagram

33
Cont…..
 Once the spiders have completed the task of finding information
on Web pages the search engine must store the information in a
way that makes it useful.

 A search engine could just store each word and the URL where it was
found. In reality, this would make for an engine of limited use,
since there would be no way of telling whether the word was used in
an important or a trivial way on the page.

34
Cont…
 A ranking list tries to present the most useful pages at the top of
the list of search results.
 The engine might assign a weight to each entry, with increasing
values assigned to words as they appear near the top of the
document, in sub-headings, in links, in the meta tags or in the title
of the page.
 An index has a single purpose: it allows information to be found as
quickly as possible. There are quite a few ways for an index to be
built, but one of the most effective ways is to build a hash table. In
hashing, a formula is applied to attach a numerical value to each
word.
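
Weighting and hashing can be combined in one small sketch. Python's `dict` is a hash table, so it serves as the index here; the weights in `FIELD_WEIGHTS` and the sample pages are invented for illustration, and real engines use far more elaborate weighting.

```python
from collections import defaultdict

# Assumed weights: a word in the title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3.0, "body": 1.0}

def build_weighted_index(pages):
    # A dict is a hash table: each word is hashed to locate its entry,
    # so on average a lookup does not slow down as the index grows.
    index = defaultdict(lambda: defaultdict(float))
    for url, fields in pages.items():
        for field_name, text in fields.items():
            for word in text.lower().split():
                index[word][url] += FIELD_WEIGHTS[field_name]
    return index

pages = {
    "http://example.com/a": {"title": "Python tutorial", "body": "learn python step by step"},
    "http://example.com/b": {"title": "Cooking", "body": "a python ate my recipe"},
}
index = build_weighted_index(pages)

# Pages containing "python", highest weight first.
ranked = sorted(index["python"].items(), key=lambda item: -item[1])
```

Page `a` ranks first because the word appears in its title as well as its body, which is the kind of distinction a plain word-to-URL list cannot make.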

35
2. Provide search form
 Searching through an index involves a user building a
query and submitting it through the search engine.
 The query can be quite simple, a single word at minimum. Building
a more complex query requires the use of Boolean operators that
allow you to refine and extend the terms of the search.
 Boolean operators: AND, OR, NOT, FOLLOWED BY, NEAR, etc.
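
If each indexed word maps to the set of documents containing it, the AND, OR and NOT operators correspond directly to set operations. The sketch below uses a made-up `INDEX` with numeric document IDs; FOLLOWED BY and NEAR are not covered, since they would additionally need word positions.

```python
# Hypothetical inverted index: word -> set of document IDs containing it.
INDEX = {
    "search": {1, 2, 3},
    "engine": {1, 3},
    "database": {2, 4},
}

def docs_with(term):
    # Unknown terms match no documents.
    return INDEX.get(term, set())

both = docs_with("search") & docs_with("engine")        # search AND engine
either = docs_with("engine") | docs_with("database")    # engine OR database
without = docs_with("search") - docs_with("database")   # search NOT database
```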

36
Cont…..
 Most search engines support caching to dramatically reduce the
time spent searching for common words like "Amazon". If the site
receives a query whose result is stored in the cache, it returns the result
from the cache without sending a query to the main
database.

37
3. Display result
 After the search engine receives the result from the main database
or cache, the site has to display it to the user. The listing of
results is usually quite simple: just a list of the web pages that were hit, each with
a description of the site. However, the order of the list is
important yet difficult to judge by pure computation.

38
Page rank
 Once the search engine has found web pages for the given
query, in what order should the links be presented?
 Google researchers invented PageRank.
 Some pages are found to be more important than others,
so if two pages match a query, they are ordered so that the more
important page’s link comes first.
 Ordering is based on the page rank, which primarily looks to
see if a page is an “authority” page, meaning that a lot
of other pages link to it.
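
The idea that a page is important when many pages link to it can be sketched as a simple iterative computation. This is a simplified textbook-style version, not Google's actual implementation: the damping value 0.85, the fixed iteration count and the four-page link graph are assumptions, and every page is assumed to have at least one outgoing link.

```python
def page_rank(links, damping=0.85, iterations=50):
    # links: page -> list of pages it links to (each page needs >= 1 link).
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small base share, then receives an equal
        # portion of the rank of each page that links to it.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            share = damping * rank[page] / len(outgoing)
            for target in outgoing:
                new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page web: A, B and D all link to C.
LINKS = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = page_rank(LINKS)
best = max(ranks, key=ranks.get)
```

Here `C` ends up with the highest rank because three pages link to it, while `B` and `D`, which nothing links to, keep only the base share.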

39
Cont….
 Similarly, a “hub” is a page which has a lot of outgoing links and
may represent a good starting point
 Advertising can also affect the order in which pages are offered
 Advertisers will pay search engine sites to place their links before others, or in
special areas of the web page
 If you go to Google and search for computers, you get links for Dell,
Apple, Staples, and others near the top and to the right of the page.
Why?
 They paid to be there!
 Best Buy didn’t pay as much, so they are located lower down!

 This is a consequence of commercializing the web: money talks

40
Search Will Never Be Perfect
 Search engines can’t read minds
 User queries are short and ambiguous
 Some things will help
 Design a usable interface
 Show match words in context
 Keep index current and complete
 Adjust heuristic weighting
 Maintain suggestions and synonyms
 Consider faceted metadata search

41
THANK YOU

42
