Main Challenges the Web Poses for Knowledge Discovery

The Web is vast and complex, which makes it hard to understand and therefore hard to search. No one can know the current state of the Web at any given moment, so locating useful information is a challenge in itself. Content is generated so quickly that it may not be possible to mine it in any depth. The Web's size and complexity also prevent search engines from crawling all pages at once and ranking the results meaningfully (for example, by relevance). Even mining everything reachable from a single page is often impractical because of the volume of content, the number of outgoing links, and similar factors.
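The crawling constraint above can be illustrated with a small sketch. The link graph below is purely hypothetical stand-in data, and real crawlers add politeness delays, deduplication, and distributed frontiers, but even this toy breadth-first crawler shows how a page budget forces the crawler to leave part of the graph unvisited:

```python
from collections import deque

# A toy in-memory "web": page id -> outgoing links (hypothetical data,
# standing in for pages a real crawler would fetch and parse).
WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a", "d", "e"],
    "d": [],
    "e": ["f"],
    "f": [],
}

def crawl(seed, page_budget):
    """Breadth-first crawl that stops after visiting `page_budget` pages,
    mirroring how real crawlers must bound their frontier."""
    seen, frontier, visited = {seed}, deque([seed]), []
    while frontier and len(visited) < page_budget:
        url = frontier.popleft()
        visited.append(url)
        for link in WEB.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# With a budget of 3 pages, pages "e" and "f" are never reached:
# crawl("a", 3) -> ["a", "b", "c"]
```

Scaled up to billions of pages, the same budget pressure is why no engine's index covers the whole Web.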

The Web changes dynamically and rapidly, so no static index can stay current as new content appears. Users modify content at such a pace that it is hard to maintain an up-to-date record of what has already been published. Archivists can compile snapshots of some pages over time, but as the amount of content on those pages grows, so does the size of the archive.
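One common way to slow that archive growth, sketched below under the assumption that many snapshots are identical, is content-addressed storage: each snapshot records a hash of the page, and the page body itself is stored only once per distinct version (the `Archive` class and its data are illustrative, not any particular archiving system's API):

```python
import hashlib

class Archive:
    """Stores page snapshots over time. Identical content is stored once
    (keyed by its SHA-256 digest), so storage grows only with *changed*
    content, while the timeline still records every snapshot taken."""

    def __init__(self):
        self.blobs = {}     # sha256 hex digest -> page content
        self.timeline = []  # (timestamp, url, sha256 hex digest)

    def snapshot(self, timestamp, url, content):
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        self.blobs.setdefault(digest, content)   # store body only if new
        self.timeline.append((timestamp, url, digest))

arch = Archive()
arch.snapshot(1, "example.org", "hello")
arch.snapshot(2, "example.org", "hello")     # unchanged: no new blob
arch.snapshot(3, "example.org", "hello v2")  # changed: one new blob
# len(arch.timeline) == 3, but len(arch.blobs) == 2
```

Deduplication helps, but it only defers the problem: pages whose content genuinely changes on every visit still make the archive grow without bound.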

More than 99% of Web pages have never been viewed by a human and are not indexed, so there is no efficient way to search the information they contain, which makes human-directed search even harder. Given the sheer size of the Web, even if a substantial number of pages were indexed, that would remain a small fraction of the total. It is also hard to determine what users are looking for or actually need: there is no feedback loop in which users flag relevant results, which could otherwise help improve future searches.
