Guno
ECOMMER K31
THE DEEP WEB
WHAT IS THE DEEP WEB?
Also known as:
  Undernet
  Invisible Web
  Hidden Web
All data that does not appear in search engines (e.g. Google, Yahoo, Bing)
Data that can be found through search engines is called the surface web
By one estimate, the surface web holds only 0.03% of available information
The deep web is estimated to be 500x bigger than the surface web
HOW AND WHY DOES IT EXIST?
It cannot be found by current search engine technology
Search engines use robots, or web spiders, that:
  index websites through metadata (e.g. page title, page location (URL) and repeated keywords used in the text)
  collect page data from hyperlinked pages
Many sites are dynamically generated; their data cannot be indexed because the information is:
  not hyperlinked
  not immediately accessible to web spiders
Information may or may not be purposely hidden, including:
  Data that needs to be accessed through a search interface
  Results of database queries
  Subscription-only information and other password-protected data
  Pages that are not linked to by any other page
  Technically limited content, such as content that requires a CAPTCHA
  Text content that exists outside of conventional http:// or https:// protocols
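The link-following behaviour of web spiders can be illustrated with a small sketch. The site map below, including every page name in it, is hypothetical and exists only for this example; real crawlers are far more elaborate. The point it shows is that a spider starting from the home page reaches only pages that some other page links to.

```python
from collections import deque

# Hypothetical toy site: each page maps to the pages it links to.
SITE = {
    "/": ["/about", "/products"],
    "/about": ["/"],
    "/products": ["/products/1"],
    "/products/1": [],
    "/search?q=shoes": [],   # generated only when a visitor submits a query
    "/members/report": [],   # password-protected, no page links to it
}


def crawl(site, start="/"):
    """Breadth-first traversal that follows hyperlinks only, like a simple spider."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for link in site.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


surface = crawl(SITE)        # pages the spider can reach
deep = set(SITE) - surface   # pages that exist but are never discovered

print("surface:", sorted(surface))
print("deep:", sorted(deep))
```

In this toy model the query result page and the members-only page end up in the deep set, because no hyperlink leads to them.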
HOW TO ACCESS THE DEEP WEB
Tor, or The Onion Router:
  makes tracking difficult by routing connections through servers around the world
  gives access to websites ending in .onion
  originated from onion-routing research at the US Naval Research Laboratory and was publicly released in 2003; it is used to protect political dissidents and whistleblowers
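The idea of routing a connection through several servers can be sketched with layered wrapping. This is a toy model only, not Tor's actual protocol: the relay names, the keys and the XOR-based cipher are all assumptions made for illustration, and the cipher is not secure. It shows that each relay can remove only its own layer and therefore learns only the next hop.

```python
import hashlib
import json


def keystream_xor(key, data):
    """Toy cipher: XOR data with a SHA-256-derived keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message, destination, relays):
    """Build the packet from the inside out; the last relay's layer is innermost."""
    next_hop = destination
    payload = message.encode()
    for name, key in reversed(relays):
        layer = json.dumps({"next": next_hop, "data": payload.hex()}).encode()
        payload = keystream_xor(key, layer)
        next_hop = name
    return payload


def peel(key, packet):
    """Remove one layer; returns the next hop and the still-wrapped remainder."""
    layer = json.loads(keystream_xor(key, packet))
    return layer["next"], bytes.fromhex(layer["data"])


def route(packet, relays):
    """Pass the packet along the relay chain, recording what each relay learns."""
    hops = []
    for _name, key in relays:
        next_hop, packet = peel(key, packet)
        hops.append(next_hop)
    return hops, packet


# Hypothetical relays and keys, chosen only for this example.
RELAYS = [("entry", b"key-1"), ("middle", b"key-2"), ("exit", b"key-3")]
onion = wrap("hello", "example.onion", RELAYS)
hops, delivered = route(onion, RELAYS)
print(hops, delivered)
```

In the sketch the entry relay sees only that the packet goes on to the middle relay, and only the exit relay sees the destination, which is why no single server can link sender and destination.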
IMPLICATIONS OF THE DEEP WEB
Negative: Illegal activities
  Illicit drugs, child pornography, stolen credit card numbers, human trafficking, weapons, exotic animals, copyrighted media (How Stuff Works)
  Transactions in the deep web are made in Bitcoin, a cryptographic digital currency that helps keep the transacting parties pseudonymous
Positive
  Education: finding research papers that can help different fields and industries, e.g. research on diseases
  Privacy: increased privacy in e-mail, file storage and sharing, social media, news outlets, and whistleblowing sites
  Free speech: can be used by civilians to overcome online censorship in countries with oppressive regimes
WHAT'S NEXT?
The deep web grows each day
The challenge is for programmers to improve search engine algorithms to manage big data
Big data: sets of data so large that they become hard to organize and search
Companies that learn how to manage data gain a competitive advantage; those that rely only on the surface web will not
Make content more accessible (tips can be found here:
http://oedb.org/ilibrarian/invisible-web/)
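One common way for a site owner to make unlinked pages visible to spiders is to publish a sitemap that lists them. This technique is an assumption added here for illustration and is not taken from the linked tips; the URLs below are placeholders. A minimal sketch using Python's standard library:

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string with one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder addresses of pages that no other page links to.
sitemap_xml = build_sitemap([
    "https://example.com/reports/2013",
    "https://example.com/archive/index",
])
print(sitemap_xml)
```

The resulting file is typically saved as sitemap.xml at the site root so that crawlers can find pages they would not reach by following links.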
REFERENCES
http://oedb.org/ilibrarian/invisible-web/
http://computer.howstuffworks.com/internet/basics/how-the-deep-web-works3.htm
http://en.wikipedia.org/wiki/Deep_Web