
Module 1 Introduction

■ Data Mining – Stages of the Data Mining Process
■ Data Mining Knowledge Representation
■ Technologies
■ Major Issues in Data Mining
■ Data Warehousing
■ Multidimensional data
■ OLAP Vs OLTP

Dr Senthilkumar N C, Asso Prof,


SITE
Database Management System
■ Definitions: ?
■ Software to create and manipulate a database
■ Example: ?
■ Booking flight ticket
■ Registering course in vtop
■ Booking hotels
■ Trading (share market)

■ Transactions:
■ OLTP: Online Transaction Processing
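
To make the idea of a transaction concrete, here is a minimal sketch of the flight-booking example as a single OLTP-style transaction. It uses Python's built-in `sqlite3` module; the table names, the flight number `AI101`, the passenger names, and the helper `book_seat` are all hypothetical, chosen only for illustration.

```python
import sqlite3

# In-memory database with hypothetical tables for the booking example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (flight_no TEXT PRIMARY KEY, seats_left INTEGER)")
conn.execute("CREATE TABLE bookings (id INTEGER PRIMARY KEY, flight_no TEXT, passenger TEXT)")
conn.execute("INSERT INTO flights VALUES ('AI101', 1)")  # only one seat available
conn.commit()

def book_seat(conn, flight_no, passenger):
    """Book one seat as a single transaction; return True on success."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            cur = conn.execute(
                "UPDATE flights SET seats_left = seats_left - 1 "
                "WHERE flight_no = ? AND seats_left > 0",
                (flight_no,),
            )
            if cur.rowcount == 0:
                raise ValueError("no seats left")
            conn.execute(
                "INSERT INTO bookings (flight_no, passenger) VALUES (?, ?)",
                (flight_no, passenger),
            )
        return True
    except ValueError:
        return False

first = book_seat(conn, "AI101", "Asha")   # takes the last seat
second = book_seat(conn, "AI101", "Ravi")  # rejected, nothing is written
booked = conn.execute("SELECT COUNT(*) FROM bookings").fetchone()[0]
print(first, second, booked)
```

Either both the seat count and the booking record change, or neither does; that all-or-nothing behaviour is what the short, frequent transactions of an OLTP system rely on.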

Why Data Mining?

■ Explosive growth of data


■ Data from
■ Business
■ Science
■ Education
■ Society

We are drowning in data but starving for knowledge.

Science Data

Business data

Customer Bought in Tatacliq.

Data Mining
Data mining:

Extracting or mining knowledge from huge amounts of data.

Alternative names:

knowledge mining from data,

knowledge extraction,

data/pattern analysis,

data archeology,

data dredging,

Knowledge Discovery (mining) from Data (KDD).

Knowledge Discovery (KDD) Process

(Figure: the KDD process as a sequence of steps, from source data upward)

■ Databases
■ Data Cleaning and Data Integration
■ Data Warehouse
■ Selection and Transformation
■ Task-relevant Data
■ Data Mining
■ Pattern Evaluation
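
The KDD stages named above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the two toy "databases" (`store_a`, `store_b`), the function names, and the `min_support` threshold are hypothetical, and the mining step is a simple item-frequency count rather than any particular algorithm from the course.

```python
from collections import Counter

# Hypothetical source databases: (customer, item) purchase records.
store_a = [("c1", "milk"), ("c2", "bread"), ("c1", None), ("c3", "milk")]
store_b = [("c1", "bread"), ("c2", "milk"), ("c2", "bread"), ("c3", " Milk ")]

def clean(records):
    """Data cleaning: drop records with a missing item, normalize item names."""
    return [(c, i.strip().lower()) for c, i in records if i is not None]

def integrate(*sources):
    """Data integration: combine the cleaned sources into one store."""
    merged = []
    for src in sources:
        merged.extend(src)
    return merged

def select_and_transform(warehouse):
    """Selection and transformation: build one set of items per customer."""
    baskets = {}
    for customer, item in warehouse:
        baskets.setdefault(customer, set()).add(item)
    return baskets

def mine(baskets):
    """Data mining: count how many baskets contain each item."""
    counts = Counter()
    for items in baskets.values():
        counts.update(items)
    return counts

def evaluate(counts, n_baskets, min_support=0.6):
    """Pattern evaluation: keep items whose support meets the threshold."""
    return {
        item: n / n_baskets
        for item, n in counts.items()
        if n / n_baskets >= min_support
    }

warehouse = integrate(clean(store_a), clean(store_b))
baskets = select_and_transform(warehouse)   # task-relevant data
patterns = evaluate(mine(baskets), len(baskets))
print(patterns)
```

Each function corresponds to one stage, so the output of one step is the input of the next, which mirrors how the process moves from raw databases toward evaluated patterns.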
Architecture: Typical Data Mining System

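(Figure: layers of a typical data mining system, top to bottom)

■ Graphical User Interface
■ Pattern Evaluation, with the Knowledge Base alongside
■ Data Mining Engine
■ Database or Data Warehouse Server
■ Data cleaning, integration, and selection
■ Data sources: Database, Data Warehouse, World-Wide Web, Other Info Repositories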
Data Mining: On What Kinds of Data?
■ Database-oriented data sets and applications

Relational database, data warehouse, transactional database
■ Advanced data sets and advanced applications

Object-relational databases

Temporal data, Sequence data, Time-series data,

Spatial data and spatiotemporal data

Text databases and Multimedia database

Heterogeneous databases and legacy databases

Data streams and sensor data

The World-Wide Web
Data Mining Tasks
Two categories:
Descriptive:
Descriptive mining tasks characterize the general properties of the data in the database.

Predictive:
Predictive mining tasks perform inference on the current data in order to make predictions.
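
The difference between the two categories can be seen on a tiny data set. The sketch below is an illustration only: the (hours studied, exam score) values are made up, the descriptive task is a simple summary, and the predictive task is a least-squares line fitted by hand, chosen here as one simple example of prediction.

```python
from statistics import mean

# Hypothetical data: (hours studied, exam score) for five students.
data = [(1, 52), (2, 58), (3, 65), (4, 70), (5, 78)]
hours = [h for h, _ in data]
scores = [s for _, s in data]

# Descriptive task: characterize general properties of the existing data.
summary = {"min": min(scores), "max": max(scores), "mean": mean(scores)}

# Predictive task: infer a model from the current data (least-squares line),
# then use it to predict a value that is not in the data.
h_bar, s_bar = mean(hours), mean(scores)
slope = sum((h - h_bar) * (s - s_bar) for h, s in data) / sum(
    (h - h_bar) ** 2 for h in hours
)
intercept = s_bar - slope * h_bar

def predict(h):
    """Predicted score for h hours of study, using the fitted line."""
    return intercept + slope * h

print(summary)
print(predict(6))
```

The summary only describes the five students already recorded, while `predict(6)` makes a statement about an unseen case, which is the distinction the two definitions draw.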
