Main Challenges the Web Poses for Knowledge Discovery

The Web is vast and complex, which makes it hard to understand and therefore hard to search. No one can know the current state of the Web at any given moment, so locating useful information is a challenge in itself. Content is generated so quickly that it may not be possible to mine it in any depth. The Web's size and complexity also prevent search engines from crawling all pages at once and ranking the results meaningfully (for example, by relevance). Even mining everything reachable from a single page is often impractical because of the volume of content, the number of outgoing links, and similar factors.
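The crawling constraint above can be illustrated with a small sketch. The link graph below is purely hypothetical stand-in data, and real crawlers add politeness delays, deduplication, and distributed frontiers, but even this toy breadth-first crawler shows how a page budget forces the crawler to leave part of the graph unvisited:

```python
from collections import deque

# A toy in-memory "web": page id -> outgoing links (hypothetical data,
# standing in for pages a real crawler would fetch and parse).
WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a", "d", "e"],
    "d": [],
    "e": ["f"],
    "f": [],
}

def crawl(seed, page_budget):
    """Breadth-first crawl that stops after visiting `page_budget` pages,
    mirroring how real crawlers must bound their frontier."""
    seen, frontier, visited = {seed}, deque([seed]), []
    while frontier and len(visited) < page_budget:
        url = frontier.popleft()
        visited.append(url)
        for link in WEB.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# With a budget of 3 pages, pages "e" and "f" are never reached:
# crawl("a", 3) -> ["a", "b", "c"]
```

Scaled up to billions of pages, the same budget pressure is why no engine's index covers the whole Web.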

The Web changes dynamically and rapidly, so no static index can stay current as new content appears. Users modify content at such a pace that it is hard to maintain an up-to-date record of what has already been published. Archivists can compile snapshots of some pages over time, but as the amount of content on those pages grows, so does the size of the archive.
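One common way to slow that archive growth, sketched below under the assumption that many snapshots are identical, is content-addressed storage: each snapshot records a hash of the page, and the page body itself is stored only once per distinct version (the `Archive` class and its data are illustrative, not any particular archiving system's API):

```python
import hashlib

class Archive:
    """Stores page snapshots over time. Identical content is stored once
    (keyed by its SHA-256 digest), so storage grows only with *changed*
    content, while the timeline still records every snapshot taken."""

    def __init__(self):
        self.blobs = {}     # sha256 hex digest -> page content
        self.timeline = []  # (timestamp, url, sha256 hex digest)

    def snapshot(self, timestamp, url, content):
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        self.blobs.setdefault(digest, content)   # store body only if new
        self.timeline.append((timestamp, url, digest))

arch = Archive()
arch.snapshot(1, "example.org", "hello")
arch.snapshot(2, "example.org", "hello")     # unchanged: no new blob
arch.snapshot(3, "example.org", "hello v2")  # changed: one new blob
# len(arch.timeline) == 3, but len(arch.blobs) == 2
```

Deduplication helps, but it only defers the problem: pages whose content genuinely changes on every visit still make the archive grow without bound.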

More than 99% of Web pages have never been viewed by a human and are not indexed, so there is no efficient way to search the information they contain, which makes human-directed search even harder. Given the sheer size of the Web, even if a substantial number of pages were indexed, that would remain a small fraction of the total. It is also hard to determine what users are looking for or actually need: there is no feedback loop in which users flag relevant results, which could otherwise help improve future searches.
