
Data Mart

A Data Mart is a smaller, more focused Data Warehouse: a mini-warehouse. A Data Mart typically reflects the business rules of a specific business unit within an enterprise.

Data marts
A data mart is a collection of subject areas organized for decision support based on the needs of a given department. Finance has its data mart, marketing has its own, sales has its own, and so on. The data mart for marketing only faintly resembles anyone else's data mart.

CSP002N-week2

Characteristics
Each department has its own interpretation of what a data mart should look like and each department's data mart is peculiar to and specific to its own needs. Typically, the database design for a data mart is built around a star-join structure that is optimal for the needs of the users found in the department.
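
As a minimal sketch of a star-join structure, assume a hypothetical marketing data mart with one fact table (sales_fact) and two dimension tables (product_dim, region_dim). All table names, columns, and values below are illustrative assumptions, not taken from the slides. The query joins the fact table to each dimension and aggregates the result:

```python
import sqlite3

def build_marketing_mart():
    """Create a tiny in-memory star schema (hypothetical names and data)."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE region_dim  (region_id  INTEGER PRIMARY KEY, region   TEXT);
        CREATE TABLE sales_fact  (product_id INTEGER, region_id INTEGER, amount REAL);
        INSERT INTO product_dim VALUES (1, 'consumer'), (2, 'industrial');
        INSERT INTO region_dim  VALUES (1, 'north'), (2, 'south');
        INSERT INTO sales_fact  VALUES
            (1, 1, 100.0), (1, 2, 50.0), (2, 1, 400.0), (1, 1, 25.0);
    """)
    return con

def sales_by_category_and_region(con):
    """Star join: the fact table joined to each dimension, then aggregated."""
    query = """
        SELECT p.category, r.region, SUM(f.amount)
        FROM sales_fact AS f
        JOIN product_dim AS p ON p.product_id = f.product_id
        JOIN region_dim  AS r ON r.region_id  = f.region_id
        GROUP BY p.category, r.region
        ORDER BY p.category, r.region
    """
    return con.execute(query).fetchall()

rows = sales_by_category_and_region(build_marketing_mart())
```

The fact table sits at the centre and holds only keys and measures, while the descriptive attributes live in the dimensions, which is what gives the structure its star shape.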

Consumer products

Industrial products

Product managers

Consumer products are sold through channels, and the channels are not controlled by the manufacturer.
Industrial products are supplied by manufacturers and sold in large lots (e.g. raw material), based on existing relationships and contracts.

A data mart serves only its local community and is modelled on the information needs of that community.
If the information needs are similar, there will be one data mart per division; different information needs require different data marts.
The goal of a data mart is to answer the questions its end users are asking.
A data warehouse is a global (enterprise-wide) model.
A data mart cannot handle markets at the international level (many products).

Data Warehouse to Data Mart

[Figure: a central Data Warehouse feeds several Data Marts; each Data Mart delivers Decision Support Information.]

Data Warehouse & Mart

Set of tables
2 or more dimensions
Designed for aggregation

Source: adapted from Strange (1997).

Data Mining: one step of KDD

KDD = Knowledge Discovery in Databases

[Figure: the KDD process. Databases and flat files -> Cleaning and Integration -> Data Warehouse -> Selection and Transformation -> Data Mining -> Patterns -> Evaluation & Presentation -> Knowledge.]

KDD steps (contd.)
Data cleaning
To remove noise and inconsistent data

Data integration
Multiple data sources may be combined

Data selection
Data relevant to the analysis task are retrieved from the database

Data transformation
Data are transformed or consolidated into forms appropriate for mining by performing summary or aggregation operations
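
The four preparation steps above can be sketched as small functions chained together. This is only an illustration under stated assumptions: the record layout (product, region, amount), the two sources store_a and store_b, and the cleaning rule (drop missing or negative amounts) are all hypothetical.

```python
from collections import defaultdict

def clean(records):
    """Data cleaning: drop records with a missing or negative amount."""
    return [r for r in records
            if r.get("amount") is not None and r["amount"] >= 0]

def integrate(*sources):
    """Data integration: combine several sources into one list of records."""
    combined = []
    for source in sources:
        combined.extend(source)
    return combined

def select(records, region):
    """Data selection: keep only the records relevant to the analysis task."""
    return [r for r in records if r["region"] == region]

def transform(records):
    """Data transformation: aggregate the amounts per product."""
    totals = defaultdict(float)
    for r in records:
        totals[r["product"]] += r["amount"]
    return dict(totals)

# Hypothetical sample data from two sources.
store_a = [{"product": "tea", "region": "north", "amount": 10.0},
           {"product": "tea", "region": "north", "amount": None}]
store_b = [{"product": "tea", "region": "north", "amount": 5.0},
           {"product": "soap", "region": "south", "amount": 7.0},
           {"product": "soap", "region": "north", "amount": -1.0}]

prepared = transform(select(integrate(clean(store_a), clean(store_b)), "north"))
```

Here `prepared` holds one summary value per product for the selected region, which is the kind of consolidated form a mining step would then consume.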

KDD steps (contd.)
Data mining
An essential process where intelligent methods are applied in order to extract data patterns

Pattern evaluation
To identify the truly interesting patterns representing knowledge based on some interestingness measures

Knowledge presentation
Visualization and knowledge representation techniques are used to present the mined knowledge to the users


Early Steps of Data Mining


Data preprocessing
Handles incomplete data, noisy data, and uncertain data

Data discretization/representation
Transforms data into suitable values for the mining algorithm to find patterns

Data selection
Selects the suitable data for mining purposes
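
One common way to discretize a numeric attribute is equal-width binning. The slides do not name a particular method, so the choice of technique, the function name equal_width_bins, and the sample incomes below are assumptions made for illustration:

```python
def equal_width_bins(values, n_bins):
    """Map each numeric value to a bin index in 0..n_bins-1 (equal-width binning)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

incomes = [20, 35, 50, 65, 80]  # hypothetical values
bins = equal_width_bins(incomes, 3)
```

Each income is replaced by a small integer label, which many mining algorithms handle more easily than raw continuous values.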

Database Systems

Kinds of DB: relational, data warehouse, transactional DB, advanced DB system, flat files, WWW

Kinds of knowledge: classification, association, clustering, prediction



Data Mining Types of Data


Applying data mining goes further:
Searching for trends or data patterns
Analyzing customer data to predict the credit risk of new customers based on their income
Detecting deviations: items whose sales are far from those expected in comparison with the previous year (to be investigated further: a change in packaging? an increase in price?)
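
Deviation detection against the previous year can be sketched as a relative-change check. This is a simplification that treats last year's figure as the expected value; the 50% threshold, the function name flag_deviations, and the sales figures are assumptions for illustration only:

```python
def flag_deviations(last_year, this_year, threshold=0.5):
    """Return items whose relative sales change vs. last year exceeds the threshold."""
    flagged = {}
    for item, expected in last_year.items():
        if expected == 0:
            continue  # no baseline to compare against
        actual = this_year.get(item, 0.0)
        change = (actual - expected) / expected
        if abs(change) > threshold:
            flagged[item] = change
    return flagged

# Hypothetical yearly sales per item.
last_year = {"soap": 100.0, "tea": 200.0, "jam": 80.0}
this_year = {"soap": 105.0, "tea": 60.0, "jam": 130.0}
deviations = flag_deviations(last_year, this_year)
```

The flagged items are only candidates: the check says that something changed, and the follow-up investigation is what explains why.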

Transaction Database
Similar to a relational database (transactions are stored in a table)
Each row (record) is a transaction with an id and the list of items in the transaction
The item list makes the table a nested relation
It can be unfolded into a relational database or stored in flat files, since nested relational structures are not supported by relational DB systems
Which items sold well together?
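
A minimal sketch of both ideas, unfolding the nested transaction table into flat rows and counting which item pairs occur together, might look as follows. The transaction ids and items are made-up sample data, and plain pair counting stands in for a full association-mining algorithm:

```python
from collections import Counter
from itertools import combinations

# Nested form: one transaction id mapped to its list of items (hypothetical data).
transactions = {
    "T100": ["bread", "milk"],
    "T200": ["bread", "butter", "milk"],
    "T300": ["butter", "jam"],
}

def unfold(transactions):
    """Flatten the nested (id, item list) form into relational (id, item) rows."""
    return [(tid, item) for tid, items in transactions.items() for item in items]

def pair_counts(transactions):
    """Count how many transactions contain each unordered pair of items."""
    counts = Counter()
    for items in transactions.values():
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return counts

rows = unfold(transactions)
counts = pair_counts(transactions)
best_pair, best_count = counts.most_common(1)[0]
```

Sorting the items before pairing ensures that ("bread", "milk") and ("milk", "bread") are counted as the same pair.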


Data Mining Types of Data


Data Warehouse
Stores historical data, potentially from multiple sources
Organized around major subjects
Contains summary statistics

Object / Object-Relational Databases


Database consisting of objects
Object = set of variables + associated methods
E.g. Intel uses regularity extraction in automatic circuit layout

Images
Can mine features extracted from images, OR can use mining techniques to extract features
Content-based image retrieval


Data Mining Types of Data


Vector Geometries (spatial db)
Include GIS and CAD data
Raster data: n-dimensional bit maps / pixel maps
Vector format: point, line, polygon
Can find spatial patterns between features
E.g. describing the characteristics of houses located near a specified kind of location
E.g. describing the climate of mountainous areas located at various altitudes

Text
Can be unstructured, semi-structured, or structured
Documentation, newspaper articles, web sites, etc.
Can facilitate search by linking related documents / concepts

Data Mining Types of Data


Video / Audio
Speech recognition: recognizing spoken commands
Security applications
Integrated with standard data mining methods (storage and searching)

Temporal Databases / Time Series


Global change databases (temperature records)
Space shuttle telemetry
Stock market data (stock exchange)
Usually store relational data that include time-related attributes
Find the trend of changes for objects, to support decision making and strategy planning
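
One simple way to quantify a trend in a time series is the least-squares slope of the values against their time index. The slides do not prescribe a method, so this choice, the function name linear_trend, and the temperature readings are assumptions for illustration:

```python
def linear_trend(series):
    """Return the least-squares slope of the values against their time index."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_t = (n - 1) / 2           # mean of the indices 0..n-1
    mean_y = sum(series) / n
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series))
    var = sum((t - mean_t) ** 2 for t in range(n))
    return cov / var

yearly_temperature = [14.0, 14.5, 15.0, 15.5]  # hypothetical readings
slope = linear_trend(yearly_temperature)
```

A positive slope indicates an upward trend per time step and a negative slope a downward one; real series usually need smoothing or seasonality handling before such a fit is meaningful.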

