
UNIT 1

BROWSING THE INTERNET

"Technology will never replace great
teachers. Technology in the hands of a
great teacher can be transformational."
What Does Browsing Mean?

Browsing is the act of looking through a set of
information quickly, without a specific sense of
purpose. In the context of the internet, it usually
refers to using the World Wide Web. The term may
imply a sense of aimlessness, with the user just
wasting time on the internet.
A. What is a Search Engine?
A search engine finds information online
by searching for words that the inquirer
has typed in.
It is a computer program, i.e., a piece of
software that helps us find the information
we are looking for on the internet using
keywords or key phrases.
According to Searchmetrics: “A search
engine is a website through which users
can search internet content.”
Using a search engine: Let’s imagine I am going to fly to Bangkok, Thailand, from Iloilo, Philippines, this afternoon. I want to find out what the weather is like in Bangkok. I go to Yahoo.com, a search engine, and place the keywords “Weather Bangkok” in the search box. Yahoo finds and shows me a list of websites that have the information I need.

When a user types words into the search engine, it looks for web pages with those words. There could be thousands, or even millions, of web pages with those words. So, the search engine helps users by putting the web pages it thinks the user wants first.

Search engines are very useful for finding information about anything quickly and easily. Using more keywords or different keywords improves the results of searches.

• A search service may also include a portal with news, games, and more information besides a search engine. Yahoo! has a popular portal, while Google has a simple design on its front page. Search services usually work without charging money for finding sites, and are often supported with text or banner advertisements.
Search Engine Optimization (SEO)
The websites that appear on the first
Google search results page get the
most visitors.
If yours appears on the first page, it has
excellent SEO, or search engine
optimization.
Part of SEO is making sure you have
the right keywords in your text so that
search engines can determine what your
focus is.
• Search engines use robots to ‘crawl’
online content. The process of crawling is
the first measure that search engines take
before indexing content in virtually any
form–videos, text, images, webpages, etc.
The content may constitute newly
uploaded content to the internet or content
that features updates or changes to its
material. These robots, also known as
crawlers or bots, record the information
along with its links. Once the material has
been crawled, it can be stored in a
massive URL database. It’s this database
that generates internet search results.
Crawling
The discovery process in which
search engines send out a team
of robots (known as crawlers or
spiders) to find new and updated
content. Content can vary — it
could be a webpage, an image, a
video, a PDF, etc. — but
regardless of the format, content
is discovered by links.
Indexing
• After the bots crawl content, it can be indexed in
the database and arranged in terms of its
relevance. If internet content has not been
crawled or indexed, it is unlikely to appear in the
search results when someone makes a query no
matter how relevant that content may be. After
the content has been crawled, each of its words
is indexed. The search engines also pinpoint
where words are located on the crawled pages.
During the indexing process, the search engine
compares the content to other content with
similar ‘words’ and decides how to organize it
within its index.
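The indexing step described above can be sketched in a few lines of Python. This is only a toy illustration, not how any real search engine stores its index: the page texts and URLs are made up, and the `build_index` helper is a hypothetical name. It records, for each word, which pages contain it and where on the page the word occurs.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each word to {url: [positions where the word occurs]}."""
    index = defaultdict(dict)
    for url, text in pages.items():
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            index[word].setdefault(url, []).append(position)
    return index

# Hypothetical crawled pages (URL -> extracted text), for illustration only.
pages = {
    "example.com/a": "Weather in Bangkok today",
    "example.com/b": "Bangkok travel guide and Bangkok weather",
}

index = build_index(pages)
print(sorted(index["weather"]))            # pages that contain "weather"
print(index["bangkok"]["example.com/b"])   # word positions on page b
```

Looking a word up in this structure is a single dictionary access, which is why a query can be answered without re-reading every page.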
Ranking
• Ranking is a complex process that is dependent on search engine algorithms. When a searcher makes a query on Google looking for anything from 19th-century British landscape painters to New York City plumbers, the search engine will generate a list of good matches to that query. How these matches appear in the list relates to their rank. The search engine lists what it ‘thinks’ are the best answers to the query early in its search results.

• Google and other search engines rely on algorithms to interpret the searcher’s query, identify the websites and pages in its index that are related to the request, and it then ranks them in terms of relevance in its presented search results list. What’s important to search engines is to provide searchers with the most relevant matches to their queries possible. Website operators, in turn, use search engine optimization to give their pages a higher rank.
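To make the idea of ranking concrete, here is a deliberately simple Python sketch. Real ranking algorithms are far more complex and are not public; the scoring rule used here (count how often the query words occur on a page) is an assumption chosen only for illustration, and the pages and the `rank` helper are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank(query, pages):
    """Return URLs ordered by a toy relevance score (query-word occurrences)."""
    terms = set(tokenize(query))
    scores = {}
    for url, text in pages.items():
        score = sum(1 for word in tokenize(text) if word in terms)
        if score > 0:
            scores[url] = score
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical indexed pages, for illustration only.
pages = {
    "example.com/plumbers": "New York City plumbers and plumbing repair in New York",
    "example.com/painters": "British landscape painters of the 19th century",
    "example.com/nyc": "Things to do in New York City",
}

results = rank("New York City plumbers", pages)
print(results)
```

The page about painters shares no words with the query, so it does not appear in the results at all; the other two pages are listed with the closer match first.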
B. BOOKMARKS
What is a bookmark/favourite? A bookmark is a web browser feature used to save a web site's URL address for future reference. Bookmarks save user and browser time, which is especially useful for Web pages with long URLs or for accessing a specific part of the site that might not be the homepage for the site.

HOW DO I FIND BOOKMARKS ON THE INTERNET?
Find a Bookmark
1. On your computer, open Chrome.
2. In the address bar, enter @bookmarks.
3. Press Tab or Space. You can also click Search Bookmarks.
4. Enter keywords for the bookmark that you want.
5. Select your bookmark from the list.
C. CRITERIA FOR EVALUATING WEB RESOURCES

There are six (6) criteria that should be applied when evaluating any Web site: authority, accuracy, objectivity, currency, coverage, and appearance.

For each criterion, there are several questions to be asked. The more questions you can answer "yes", the more likely the Web site is one of quality.
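The "count the yes answers" idea can be written down as a small Python checklist. This is only a sketch: the `quality_score` helper, the sample answers, and the choice to weight every question equally are assumptions made for illustration and are not part of the criteria themselves.

```python
CRITERIA = ("authority", "accuracy", "objectivity",
            "currency", "coverage", "appearance")

def quality_score(answers):
    """Return the fraction of checklist questions answered "yes" (True)."""
    all_answers = [a for criterion in CRITERIA for a in answers.get(criterion, [])]
    if not all_answers:
        return 0.0
    return sum(all_answers) / len(all_answers)

# Hypothetical answers for one Web site: True means "yes", False means "no".
answers = {
    "authority": [True, True, False, True],
    "accuracy": [True, True],
    "objectivity": [True, False, True],
    "currency": [True],
    "coverage": [True, True, False],
    "appearance": [True, True, True],
}

score = quality_score(answers)
print(f"{score:.0%} of the questions were answered yes")
```

A higher fraction suggests a more trustworthy site, but the number is only a rough guide; a single "no" on authority may matter more than several "yes" answers on appearance.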
1. Authority
a. Is it clear who is responsible for the contents of the page?
b. Is there a way of verifying the legitimacy of the organization, group, company or individual?
c. Is there any indication of the author's qualifications for writing on a particular topic?
d. Is the information from sources known to be reliable?
a. Is it clear who is responsible for the contents of the page?
WHY IMPORTANT? - It is critical to relate the ideas you find at a site to a particular author, organization, or business. In this way, there is a degree of accountability for any of the ideas expressed. Once the individual or organization responsible for the content is known, you can then begin to look at other clues to help you ascertain credibility, such as credentials and reputation. Be especially wary of sites in which the author or sponsoring organization is not clearly stated.
b. Is the information from sources known to be reliable?
WHY IMPORTANT? - Statements from established and
reputable organizations almost always have been seen
and approved by several people. As a result, this check
and balance system helps prevent the release of
unsound information. Government sites (.gov) are very
good examples of organizations where information is
disseminated through this type of system.
2. Accuracy

Are the sources for factual information clearly listed so they can be verified in another source?
WHY IMPORTANT? - A source of information is known to be scholarly when it provides references to the information presented. In this way, the reader can confirm whether the information is accurate or the author's conclusions reasonable. A page without references still may be useful as an example of the ideas of an individual, organization, or business, but not as a source of factual information.

Is the information free of grammatical, spelling, and other typographical errors?
WHY IMPORTANT? - Such errors not only indicate a lack of attention and effort, but also can actually produce inaccuracies in information. Whether the errors come from carelessness or ignorance, they both put the information or writer in an unfavorable light.
3. Objectivity

a. Does the content appear to contain any evidence of bias?
WHY IMPORTANT? - If the content contains bias, only one point of view is being presented. This may not be bad depending on your needs.

Directly related to bias is the concept of fairness. Good information sources will use a calm, reasoned tone to present information in a balanced manner. Pay attention to the tone and be cautious of sites that contain highly emotional writing. Writing that is overly critical, attacking, or spiteful often indicates an irrational and unfair presentation rather than a reasoned argument.
b. Is there a link to a page describing the goals or purpose of the sponsoring organization or company?
WHY IMPORTANT? - The goals or purpose of a group, organization, or company can help you assess for possible bias. For example, let's say you found an article in the online newspaper The Truth at Last stating how black slaves enjoyed the idea of slavery.
Thus, the article you read could be suspect based on
c. If there is any advertising on the page, is it clearly differentiated from the informational content?
WHY IMPORTANT? - The intent of advertising is to sell a product
or idea. Sometimes advertising is woven into an article, where it
is hard to notice that the information presented is actually part of
an advertisement. An example in the print world would be a
multi-page, special advertising insert in Newsweek, paid for by a
leading group of pharmaceutical companies that discusses new
developments in drug treatments for arthritis. Although the
article is very informative, its intent is to promote the products of
particular companies.
4. Currency

a. Are there dates on the page to indicate when the page was written, when the page was first placed on the Web, or when the page was last revised?
WHY IMPORTANT? - Some information is very time sensitive. For example, a page talking about the top-rated Web search engines in 1997 is going to be horribly out of date in 2000. There have been incredible changes in search engine technology and new developments appear almost monthly. However, a page discussing the Civil War is likely still relevant today even if the page was created in 1996 and has not been updated. Regardless, a site should always provide some indication of when the information was created or the site
5. Coverage

Are these topics successfully addressed, with clearly presented arguments and adequate support to substantiate them?

Does the work update other sources, substantiate other materials you have read, or add new information?

Is the target audience identified and appropriate for your needs?

WHY IMPORTANT? - Coverage is one of the most important factors to consider before using the information on a Web page. If the information appears one-sided, it could be evidence of bias. Also, you will want to see if a page is presenting a new perspective on the topic, or just summarizing other sources. If it summarizes other sources, you will likely want to get hold of the originals. If it is difficult to assess the topics covered in a page or the arguments are not presented very clearly, you might reconsider before referencing this site. Finally, be aware of the target audience to whom a page is directed. The target audience has a direct bearing on the coverage of a site.
6. Appearance

a. Does the site look well organized?
b. Do the links work?
c. Does the site appear well maintained?

WHY IMPORTANT? - In the print world, one way of assessing quality in a book is through its physical layout and appearance: the sturdiness of the binding and cover material, the presence of a well-organized table of contents and a comprehensive index, clear typeface, appropriate illustrations, etc. This attention to detail reflects an inherent quality. Likewise, in the Web environment, a sign of quality in a site is external links that work properly, an organizational structure that allows one to quickly determine the content and access it equally fast, and graphics or multimedia that complement the information presented.
COMMON COMMANDS FOR SEARCH ENGINE

What to Know

Enter define:term for a definition; use OR between terms to find either; use quotes (") to find an exact match.

Don't use a space after the colon, and feel free to join multiple commands in a single search.

Multiple commands mean fewer results, which can be good or bad.

33 advanced search engine commands to perform a targeted search, filtering and saving time

If you're like us, you use Google or Bing several times a day. But have you ever entered a keyword and not found what you wanted? Maybe you were searching for a more specific topic. To optimize your searches and save time, you can use advanced search engine commands for things such as weather, local time, and currency conversion, along with a variety of other methods.
How does it work? Quite simply: using the search box on the homepage as you normally do, you can add symbols, numbers, letters and words in order to optimize your search.

1. Search Exact Phrase
Use quotation marks (quotes) to search for an exact phrase, including the order of the words. For example, searching for “Search on Google” finds pages that contain this exact phrase.

2. Excluding a keyword
Need to filter a word out of your search results? Put a minus sign directly before it. For example: women’s shoes -men. Result: information about women’s shoes and not men’s shoes.
3. Search text with missing words
If you do not remember a word, or you want to find phrases commonly used on the internet, use an asterisk (*) in place of the missing word. For example: “Microsoft * LinkedIn”.

4. Search one or more keywords
If you want to find information about either of your keywords, or both, use the OR command. For example, iPhone 7 OR 6 will give you information on the iPhone 7, the iPhone 6, or both.
5. Internal Search
To locate information within a specific Web site, type site:example.com + keyword. For example, if you are looking for information about shoes on Amazon, type site:amazon.com shoes.

6. Search words in the title
If you are looking for information or an article and you want the words to appear in the title of the link, use allintitle: + the words themselves (or intitle: for a single word). For example: allintitle:Google Bing. The results are the sites that have these words in their title.
7. Search Words in URL
If you want to look for words in the link itself, use allinurl: + keyword (or inurl:). For example, the search inurl:analytics will give you the sites where the word analytics appears in the link.

8. Search words in text
If you are trying to find specific words that appear in the text, use the command allintext: + words (or intext: for a single word). For example, search for allintext:Google search engines Facebook. You get the sites whose text contains the words Google, Facebook, engines, and search.
9. Advanced Integration
You can combine multiple commands together to get the desired results. For example: allinurl:“search engines” -keyword.

10. Similar Sites
If you would like to search for similar websites based on their content, use the command related: + site. For example, related:facebook.com gives you sites similar to Facebook.
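Combining commands is easy to get wrong by hand (a stray space after a colon, a missing quote), so here is a small Python sketch that assembles a query string from its parts. The `build_query` function and its parameter names are hypothetical, and only a few of the commands from this unit are covered. The standard-library function `quote_plus` is used to show the URL-encoded form of the finished query.

```python
from urllib.parse import quote_plus

def build_query(keywords="", exact=None, exclude=(), site=None, filetype=None):
    """Assemble a search string from plain keywords and a few search commands."""
    parts = []
    if keywords:
        parts.append(keywords)
    if exact:
        parts.append(f'"{exact}"')                    # exact phrase
    parts.extend(f"-{word}" for word in exclude)      # excluded keywords
    if site:
        parts.append(f"site:{site}")                  # no space after the colon
    if filetype:
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)

query = build_query("shoes", exact="women's shoes",
                    exclude=["men"], site="amazon.com")
print(query)              # shoes "women's shoes" -men site:amazon.com
print(quote_plus(query))  # URL-encoded form of the same query
```

Whether a particular command is honored depends on the search engine, so treat the result as a starting point and check it in the search box.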
11. Backlinks
Look up backlinks to any website with the link: function. Example: link:www.google.com.

12. Search for information by file type
To find content in a particular file format, such as PDF, Excel, Word, etc., use the command filetype: + file type and search word. For example, filetype:pdf twitter 2016. You will get information such as financial statements of Twitter and so on.
13. The range of numbers
If you want to search within a range of numbers, put two periods (..) between the two numbers. The range can be a price, a distance, and so on.

14. Results by location
If you are looking for pages by geographic location, use the loc: command. For example, loc:Moscow pub.
15. Synonyms
Put a tilde (~) before a word to also search for synonyms of that word. For example, ~search also gives results for “looking for”.

16. Information
Look up information about a site with the info: function + link.

17. Weather
Search for the weather directly in Google using the word weather + city. For example, weather London.
18. IP address
Type IP address to find your IP address.

19. Stocks
Want to know Google’s stock price? Type stock and the company's symbol. For example, for Facebook, type stock FB (sometimes it works without the word stock).

20. Cache
This shows the cached version of a site, that is, the version Google most recently scanned. Use the cache: command. For example, cache:yahoo.com.
21. Calculator
Want to know what 458 * 850 is? Need other operations such as addition, subtraction, division, etc.? Type the equation directly into the search engine, or simply type calculator. For example: calculator, or 22-15 > Result: 7.

22. Currency
Flying to Europe? Need to know the exchange rate of a country's currency, such as the euro? Type dollar or euro in the search box. If you would like to use the table of exchange rates, type currency (this usually works on google.com).
23. Translation
Use Google Translate? It’s nice, but you can also translate words directly from the Google search box. There are several methods to do this; a preferred method is: translate [word you want to translate] to [language you want]. For example, the search translate school to Russian >> Result: Школа.

24. Glossary
Want the meaning of a word? Use the word define + the word itself, and you will get a brief explanation. For example, define frequency returns an explanation of the word.
25. World time
Need to contact someone from another country? Want to know what time it is now in any country? Quite simply, type in the search box: time + state / city / specific location. For example, time in new york.

26. Conversions
How many kilometers is one mile? How many seconds are in one year? You can use the words in or to. For example, type kg to lbs, or type unit converter or calculator, and you will get a conversion.
27. Food comparison
Want to compare an apple and an orange? Use the command vs between the search words.

28. Flights
To check the status of your flight, enter your flight number and search for information. For example, Dal163.

29. Timer / Stopwatch
Type timer or stopwatch to get started. This usually works on google.com in English.
30. Sunrise / Sunset
Type sunrise or sunset and a location, and you get when the sun rises or sets.

31. Sports
Want to know the schedule, result, location, or league of a football game? Just type Manchester United F.C. and you will get the information.

32. Colors
Do you have a specific color code? Type it in, and you can change colors directly from the Google search. For example: #000000 or RGB (0, 0, 0).

33. Bonus
Want to go straight to the first search result? Type keywords and click “I’m Feeling Lucky”.
WEB CRAWLERS

A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (web spidering).[1]
A Web crawler starts with a list of URLs to visit. Those first URLs are called the seeds. As the crawler visits these URLs, by communicating with web servers that respond to those URLs, it identifies all the hyperlinks in the retrieved web pages and adds them to the list of URLs to visit, called the crawl frontier.

URLs from the frontier are recursively visited according to a set of policies. If the crawler is performing archiving of websites (or web archiving), it copies and saves the information as it goes. The archives are usually stored in such a way they can be viewed, read and navigated as if they were on the live web, but are preserved as 'snapshots'.[5]
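The seed and frontier mechanics can be sketched in Python. To keep the example runnable without a network connection, it crawls a made-up in-memory "web" (the `FAKE_WEB` dictionary and its URLs are hypothetical); a real crawler would fetch pages over HTTP. Visiting the frontier in first-in, first-out order is an assumption made here; actual crawlers choose the order according to their policies.

```python
from collections import deque
from html.parser import HTMLParser

class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# Hypothetical in-memory "web": URL -> HTML of the page.
FAKE_WEB = {
    "http://a.example/": '<a href="http://b.example/">B</a> <a href="http://c.example/">C</a>',
    "http://b.example/": '<a href="http://a.example/">A</a> <a href="http://d.example/">D</a>',
    "http://c.example/": "no links here",
    "http://d.example/": '<a href="http://c.example/">C</a>',
}

def crawl(seeds, fetch, max_pages=100):
    """Visit pages starting from the seeds; return URLs in visiting order."""
    frontier = deque(seeds)   # the crawl frontier: URLs waiting to be visited
    seen = set(seeds)         # prevents queueing the same URL twice
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)
        if html is None:
            continue          # the page could not be retrieved
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

order = crawl(["http://a.example/"], FAKE_WEB.get)
print(order)
```

Starting from a single seed, the crawler reaches all four pages, and the `seen` set keeps it from looping forever between pages that link to each other.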
• The large volume implies the crawler can only download a limited number of the Web pages within a given time, so it needs to prioritize its downloads. The high rate of change can imply the pages might have already been updated or even deleted.

Crawling policy
The behavior of a Web crawler is the outcome of a combination of policies:[7]
• a selection policy which states the pages to download,
• a re-visit policy which states when to check for changes to the pages,
• a politeness policy that states how to avoid overloading Web sites,
• a parallelization policy that states how to coordinate distributed web crawlers.
Selection policy
This requires a metric of importance for prioritizing Web pages. The importance of a page is a function of its intrinsic quality, its popularity in terms of links or visits, and even of its URL (the latter is the case of vertical search engines restricted to a single top-level domain, or search engines restricted to a fixed Web site).

Designing a good selection policy has an added difficulty: it must work with partial information, as the complete set of Web pages is not known during crawling.
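One way to picture a selection policy working with partial information is the Python sketch below. It uses the number of in-links discovered so far as the importance metric, which is only one of the possible metrics mentioned above and is an assumption of this example. The tiny link graph and the `crawl_by_inlinks` helper are hypothetical.

```python
from collections import Counter

def crawl_by_inlinks(seed, links, max_pages):
    """Always visit next the known page with the most in-links seen so far."""
    inlinks = Counter({seed: 0})   # in-link counts known so far (partial information)
    visited = []
    while len(visited) < max_pages:
        candidates = [url for url in inlinks if url not in visited]
        if not candidates:
            break
        # Most known in-links first; ties are broken alphabetically.
        url = min(candidates, key=lambda u: (-inlinks[u], u))
        visited.append(url)
        for target in links.get(url, []):
            inlinks[target] += 1
    return visited

# Hypothetical link graph: page -> pages it links to.
LINKS = {
    "a": ["b", "c", "d"],
    "b": ["d"],
    "c": ["d"],
    "d": ["e"],
    "e": [],
}

print(crawl_by_inlinks("a", LINKS, max_pages=5))
```

Page d is visited before page c because, by the time the crawler must choose, two known pages already link to d. The crawler makes that choice without knowing the whole graph, which is exactly the difficulty described above.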
Re-visit policy
The Web has a very dynamic nature, and crawling a fraction of the Web can take weeks or months. By the time a Web crawler has finished its crawl, many events could have happened, including creations, updates, and deletions.

From the search engine's point of view, there is a cost associated with not detecting an event, and thus having an outdated copy of a resource. The most-used cost functions are freshness and age.
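Freshness and age can be expressed as two small Python functions. This is a simplified sketch of the commonly cited definitions: freshness is 1 when the stored copy still matches the live page and 0 otherwise, and age is how long the stored copy has been outdated. The parameter names are hypothetical, times are plain numbers (for example, hours), and `changed_at` is assumed to be the time of the page's most recent change.

```python
def freshness(changed_at, crawled_at):
    """Return 1 if the stored copy is still up to date, otherwise 0."""
    return 1 if changed_at <= crawled_at else 0

def age(now, changed_at, crawled_at):
    """Return 0 while the copy is up to date, else the time since the page changed."""
    if changed_at <= crawled_at:
        return 0
    return now - changed_at

# Page crawled at hour 10; the live page changed at hour 14; it is now hour 20.
print(freshness(changed_at=14, crawled_at=10))     # 0: the copy is outdated
print(age(now=20, changed_at=14, crawled_at=10))   # 6: outdated for six hours
```

A re-visit policy tries to keep average freshness high or average age low across all stored pages, which is why pages that change often tend to be crawled more frequently.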
Politeness policy
Crawlers can retrieve data much quicker and in greater depth than human searchers, so they can have a crippling impact on the performance of a site. If a single crawler is performing multiple requests per second and/or downloading large files, a server can have a hard time keeping up with requests from multiple crawlers.

Parallelization policy
A parallel crawler is a crawler that runs multiple processes in parallel. The goal is to maximize the download rate while minimizing the overhead from parallelization and to avoid repeated downloads of the same page. To avoid downloading the same page more than once, the crawling system requires a policy for assigning the new URLs discovered during the crawling process, as the same URL can be found by two different crawling processes.
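Both policies can be sketched in Python. For parallelization, one possible assignment rule (an assumption of this example, not the only option) is to hash the host name, so that every URL from the same site always goes to the same crawling process. For politeness, a simple rule is to keep a minimum delay between two requests to the same host. The names `worker_for` and `HostThrottle` and the two-second delay are hypothetical.

```python
import hashlib
from urllib.parse import urlsplit

def worker_for(url, n_workers):
    """Assign a URL to a crawling process by hashing its host name."""
    host = urlsplit(url).netloc.lower()
    digest = hashlib.sha256(host.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_workers

class HostThrottle:
    """Keep a minimum delay between two requests to the same host."""
    def __init__(self, delay_seconds):
        self.delay = delay_seconds
        self.next_allowed = {}   # host -> earliest time of the next request

    def wait_time(self, url, now):
        """Return how long to wait before fetching url, and reserve that slot."""
        host = urlsplit(url).netloc.lower()
        start = max(now, self.next_allowed.get(host, now))
        self.next_allowed[host] = start + self.delay
        return start - now

throttle = HostThrottle(delay_seconds=2.0)
waits = [
    throttle.wait_time("http://a.example/1", now=0.0),   # first request to a.example
    throttle.wait_time("http://a.example/2", now=0.5),   # same host, too soon
    throttle.wait_time("http://b.example/1", now=0.5),   # different host
]
print(waits)   # [0.0, 1.5, 0.0]
```

Because the assignment depends only on the host name, two processes that discover the same URL agree on who should download it, and the per-host delay then only needs to be tracked inside that one process.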
THANK YOU!!!
