
DATA WAREHOUSE & MINING

Web Structure Mining & Data Mining Application 4: Online Shopping
Group Members:
Juhi Ramod 1020234
Shubhom Rawat 1020237
Rajit Shetty 1020245
Pratik Shrivastav 1020247
Varad Tambe 1020253
Aakash Raj 1020254

Key topics discussed in this presentation

Web Mining
Web Structure Mining
Page Rank
HITS
Problem Solving
Data Mining Application
What is Web Mining?

Finding information patterns in web data.

Discovering useful information from the World-Wide Web and its usage patterns.
WEB MINING

Web Content Mining
    Web page content mining
    Search result mining
Web Structure Mining
    Hyperlink mining
Web Usage Mining
    General access pattern tracking
    Customised usage tracking
WEB STRUCTURE MINING

Web Structure Mining is the process of analyzing the nodes and connection structure of a website using graph theory.
A web graph contains web pages (web documents) as nodes and hyperlinks as edges.
WEB STRUCTURE TERMINOLOGY

In-degree
Out-degree
Link
Directed path
Shortest path
Diameter
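
The terms above can be illustrated on a tiny web graph. The following is a minimal sketch, not taken from the slides: the four-page site, the page names and the helper function names are all hypothetical. It also assumes that the diameter is the longest shortest path among pairs of pages that are actually connected by a directed path, which is one common convention.

```python
from collections import deque

# Hypothetical four-page site: each page maps to the pages it links to.
web_graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def out_degree(graph, page):
    """Number of hyperlinks leaving the page."""
    return len(graph[page])

def in_degree(graph, page):
    """Number of pages that link to the page."""
    return sum(page in targets for targets in graph.values())

def shortest_path_length(graph, source, target):
    """Fewest links to follow from source to target (breadth-first search).

    Returns None when no directed path exists.
    """
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        if current == target:
            return distances[current]
        for neighbour in graph[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    return None

def diameter(graph):
    """Longest shortest path over all ordered pairs that are connected."""
    lengths = [
        shortest_path_length(graph, s, t)
        for s in graph for t in graph if s != t
    ]
    return max(length for length in lengths if length is not None)

print("in-degree of C:", in_degree(web_graph, "C"))
print("out-degree of A:", out_degree(web_graph, "A"))
print("shortest path D -> B:", shortest_path_length(web_graph, "D", "B"))
print("diameter:", diameter(web_graph))
```

Here no page links to D, so there is no directed path from A to D and that pair is skipped when computing the diameter.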


Various Web Mining Algorithms

Page Rank
HITS
CLEVER
Page Rank

The PageRank algorithm is used by Google Search to rank web pages in its search engine results.

It is a way of measuring the importance of web pages.

A page's rank is calculated from the number of pages that point to it (and from the rank of those pages); the PageRank value is a probability between 0 and 1.
Algorithm
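
As an illustration of the idea described above, here is a minimal sketch of an iterative PageRank computation. It is not the algorithm slide from the original deck. It assumes the normalised form of the formula (so the ranks sum to 1 and each can be read as a probability), a damping factor of 0.85, a fixed number of iterations, and that the rank of pages without out-links is shared equally among all pages. The sample graph and the function name `pagerank` are hypothetical.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Iteratively estimate PageRank for a dict of page -> list of linked pages."""
    pages = list(graph)
    n = len(pages)
    # Start from a uniform distribution over all pages.
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Rank held by pages with no out-links (assumption: spread it evenly).
        dangling = sum(ranks[p] for p in pages if not graph[p])
        new_ranks = {}
        for page in pages:
            # Each page q linking to this page passes on an equal share of its rank.
            incoming = sum(
                ranks[q] / len(graph[q]) for q in pages if page in graph[q]
            )
            new_ranks[page] = (1 - damping) / n + damping * (incoming + dangling / n)
        ranks = new_ranks
    return ranks

# Hypothetical four-page site: each page maps to the pages it links to.
web_graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(web_graph)
for page, rank in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(rank, 4))
```

In this sample, page C is linked to by three pages and ends up with the highest rank, while page D has no incoming links and keeps only the baseline share.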
Sum Solving:
HITS

Hyperlink-Induced Topic Search

HITS, also known as Hubs and Authorities, is a link analysis algorithm developed by Jon Kleinberg that rates web pages.

It is based on a mutually recursive relationship: good hubs point to many authorities, and good authorities are pointed to by many hubs.

Algorithm
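
The mutual recursion between hubs and authorities can be sketched as a pair of repeated updates. This is an illustrative sketch under stated assumptions, not the algorithm slide from the original deck: scores start at 1, are normalised to unit length after every round, the loop runs a fixed number of iterations, and the graph has at least one link. The sample graph and the function name `hits` are hypothetical.

```python
import math

def hits(graph, iterations=50):
    """Estimate hub and authority scores for a dict of page -> list of linked pages."""
    pages = list(graph)
    hubs = {p: 1.0 for p in pages}
    authorities = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of pages that link to the page.
        authorities = {
            p: sum(hubs[q] for q in pages if p in graph[q]) for p in pages
        }
        # Hub score: sum of the authority scores of the pages it links to.
        hubs = {p: sum(authorities[q] for q in graph[p]) for p in pages}
        # Normalise both score vectors so the values do not grow without bound.
        a_norm = math.sqrt(sum(v * v for v in authorities.values()))
        h_norm = math.sqrt(sum(v * v for v in hubs.values()))
        authorities = {p: v / a_norm for p, v in authorities.items()}
        hubs = {p: v / h_norm for p, v in hubs.items()}
    return hubs, authorities

# Hypothetical four-page site: each page maps to the pages it links to.
web_graph = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

hub_scores, authority_scores = hits(web_graph)
print("best hub:", max(hub_scores, key=hub_scores.get))
print("best authority:", max(authority_scores, key=authority_scores.get))
```

In this sample, C is pointed to by the most pages and becomes the strongest authority, while A, which links to both B and C, becomes the strongest hub.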
Sum Solving
CLEVER

Helps synthesize the information contained in the large number of hyperlinks on the web.

The CLEVER algorithm is a modification of the original HITS algorithm.

It distinguishes itself from conventional search engines by analyzing how documents on the internet are linked.
DATA MINING APPLICATION
ONLINE APPLICATION
Do you have any questions?
