
Introduction to DSS & Data Warehousing Concepts
Atul Gandre
TATA
INFOTECH Ltd


Contents
• What is DSS?
• DSS architecture and its components
• Extraction, Transformation & Loading
• Data Access & Analysis
• Data marts
• Data Mining


What is DSS?


Management Objectives

• Increased profits
• Improved margins
• Reduced overheads
• Larger market share


Typical Questions a Decision Maker may ask...
“Give performance of all TV’s over the past 3 years”
• Show sales by volume, value and margin contribution
  • By different time periods
  • By product model
  • By region
• Compare 3 years sales monthwise
• Compare sales v/s type of promotion, by region
• Best distributors / Worst distributors
• Margin Analysis, Cost Breakup
• Capacity utilisation over time

...Typical Questions a Decision Maker may ask...
“Best Products / Worst Products” by
• Sales Value
• Volume
• Profit / Margins
• Market Share
• % growth

...Typical Questions a Decision Maker may ask...
“Top 10 / Bottom 10 Salesmen this year” by
• Sales value
• % of target met
• % over last year
• By region
• By product

...Typical Questions a Decision Maker may ask
“Show the Top 20 Customers”
• By month / quarter / year
• By product / product group / all products
• For a region / all regions
• Ranking over the last 12 months
• Recovery of dues

Can Transaction systems answer such queries?
The problems:
• Highly normalised structures make queries more complex
• Increase in complexity of queries due to:
  • Aggregation, Summarisation, Running totals, Cumulations, Ranking, Comparison
  • Various dimensions
• Little historical data stored on-line for comparison
• High resource utilisation will result in slow response to complex queries
• Difficulty in making ad-hoc queries

DSS - a definition
Decision Support Systems use computers to facilitate the decision making process of semi-structured tasks. These systems are designed not to replace managerial judgement but to support it and make the decisions more effective. DSS helps managers react quickly to changing needs.
- W H Inmon

DSS - a paradigm for analysis

Transactional Applications
• Operational
• Run the Business
• Long development cycles
• Detailed data
• No redundancy
• Data is normally updated
• Amount of data used in a process is small
• Serves the clerical community

DSS Applications
• Analytical
• Gather strategic information
• Constant prototype mode
• Detailed and summarized data
• Redundancy allowed
• Data is normally loaded
• Amount of data used in a process is large
• Serves the managerial community

- W H Inmon

DSS architecture and its components
• Data warehouse architecture
• Data extraction, transformation and loading
• Data access and analysis

DSS Architecture
[Figure: Data Sources (Operational Systems) and External Sources pass through Data Extraction, Scrubbing & Transformation into the Data Warehouse, which is reached through Data Access. Other labels shown: OLAP, EIS, Data Mining, AI, OR.]

DSS Architecture
The DSS Architecture consists of...
• Extraction of data from various operational systems on different platforms, then transforming and loading to the Data Warehouse
• The Data Warehouse contains historical data as well as current data
• The data in the Data Warehouse is accessed by the front-end tools

Data Warehouse - the heart of DSS
“The Data Warehouse is that portion of an overall architected data environment that serves as the single integrated source of data for decision support systems”
- W H Inmon

Data Warehouse: Another definition
“A Data Warehouse is a
• subject-oriented,
• integrated,
• time variant and
• non-volatile
collection of data in support of management’s decision-making process.”
- W H Inmon

Characteristics of a Data Warehouse
• The DW provides access to corporate / organizational data
• The data in the DW is consistent
• The data in the DW can be separated and combined by means of every possible measure in the business
• The DW is where data is “published”

DW Data Model Characteristics
• Data centric, not process based
• Simple to understand
• Flexible to add/modify
• Design reflects business information
• Query driven design
• Denormalised
• Intuitive and easy to use

The Dimensional Model
• The CEO of a Company says
• “We sell products in various markets and we measure our performance over time”
[Figure: cube with axes Time, Market and Product]

The Dimensional Model
• Each Cell in the cube contains business measures for a particular combination of Product, Market and Time
• Other Names: Star Join Schema, Star Schema, Multidimensional model

DW Data modelling
Multidimensionality: An example
[Figure: cube with time axis 2Q95, 3Q95, 4Q95, 1Q96, 2Q96; product axis Ruby, Emerald, Saffire; region axis East, West, North. The highlighted cell holds SalesRevenue for North, Emerald, 1Q96.]
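The cube in this example can be sketched as a mapping from one (product, market, quarter) combination to a measure. This is only an illustration: the revenue figures, the `cell` and `total_by` helpers, and the choice of a plain dictionary are assumptions, since the slide gives no numbers and names no storage structure.

```python
from collections import defaultdict

# Hypothetical sales revenue figures; the slide gives no actual numbers.
# Each cell is addressed by one value from each dimension:
# (product, market, quarter) -> SalesRevenue
cube = {
    ("Emerald", "North", "1Q96"): 120.0,
    ("Emerald", "North", "2Q96"): 135.0,
    ("Emerald", "East", "1Q96"): 80.0,
    ("Ruby", "North", "1Q96"): 60.0,
    ("Saffire", "West", "4Q95"): 45.0,
}


def cell(product, market, quarter):
    """Return the measure stored in one cell, or 0.0 if the cell is empty."""
    return cube.get((product, market, quarter), 0.0)


def total_by(axis):
    """Sum the measure along one dimension (0=product, 1=market, 2=time)."""
    totals = defaultdict(float)
    for key, revenue in cube.items():
        totals[key[axis]] += revenue
    return dict(totals)


north_emerald_1q96 = cell("Emerald", "North", "1Q96")
revenue_by_market = total_by(1)
```

Looking up one cell corresponds to the highlighted North / Emerald / 1Q96 combination, while `total_by` collapses the other two dimensions.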

DW Data modelling
Star Schema
[Figure: Fact table: Sales. Dimension tables: Time, Product, Region, Customer.]

Star Schema Features
• One large central table called Fact Table
• A number of attendant tables, called Dimension Tables, each having a single join attaching it to the Fact Table
• A Time Dimension
• Fact Table Primary key is a composite key of foreign keys
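A minimal sketch of these features, using Python's built-in `sqlite3` module and an in-memory database. The table names, column names and sample rows are assumptions made for illustration, and the Customer dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim    (time_id INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE product_dim (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE region_dim  (region_id INTEGER PRIMARY KEY, name TEXT);

-- Fact table: its primary key is the composite of the dimension foreign keys.
CREATE TABLE sales_fact (
    time_id       INTEGER REFERENCES time_dim(time_id),
    product_id    INTEGER REFERENCES product_dim(product_id),
    region_id     INTEGER REFERENCES region_dim(region_id),
    sales_revenue REAL,
    PRIMARY KEY (time_id, product_id, region_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO time_dim VALUES (?, ?, ?)",
                 [(1, "1Q", 1996), (2, "2Q", 1996)])
conn.executemany("INSERT INTO product_dim VALUES (?, ?)",
                 [(1, "Ruby"), (2, "Emerald")])
conn.executemany("INSERT INTO region_dim VALUES (?, ?)",
                 [(1, "North"), (2, "East")])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
                 [(1, 2, 1, 120.0), (2, 2, 1, 135.0),
                  (1, 1, 1, 60.0), (1, 2, 2, 80.0)])

# Each dimension table is attached to the fact table by a single join.
rows = conn.execute("""
    SELECT r.name, p.name, SUM(f.sales_revenue)
    FROM sales_fact f
    JOIN region_dim  r ON r.region_id  = f.region_id
    JOIN product_dim p ON p.product_id = f.product_id
    GROUP BY r.name, p.name
    ORDER BY r.name, p.name
""").fetchall()
```

Because every dimension is one join away from the fact table, a question such as "revenue by region and product" needs no joins between dimension tables.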

Snowflake Design
• Snowflake refers to normalising dimension tables
• Creating Outrigger tables
  • containing descriptions of codes in dimension table
  • containing additional attributes

Snowflake Schema Example
[Figure: snowflake schema centred on the Sales Fact table. Labels shown: Product, Supplier, Store, District, Region, Location, Time, Day, Month, Season, Seller, Sales Assoc., Sales Dept.]

Data Modelling Steps
Starting point - End users and source data
• Identify a subject area
• Find out the ‘facts’
• Associate the facts with the business dimensions
• Define the attributes in the dimensions
• Decide on the level of detail - the granularity
• Decide the ‘summarise and purge’ period

Data extraction, transformation and upload
[Figure: Operational Systems, Data Extraction, Data Transformation, Data Upload, Data Warehouse, EIS.]

Data Extraction
The Extract program
• Rummages through a file or database
• Uses some criteria for selection
• Identifies qualified data and
• Transports the data over onto another file or database
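The four steps of the extract program can be sketched roughly as follows. The file layout, the column names, the `extract` helper and the "CLOSED orders only" criterion are all hypothetical; in-memory text buffers stand in for the source and target files.

```python
import csv
import io

# Hypothetical operational file; column names and values are illustrative.
source = io.StringIO(
    "order_id,region,status,amount\n"
    "1,North,CLOSED,120.0\n"
    "2,East,OPEN,80.0\n"
    "3,North,CLOSED,60.0\n"
)


def extract(src, dest, qualifies):
    """Copy rows that satisfy `qualifies` from src to dest; return the count."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dest, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    count = 0
    for row in reader:            # rummage through the source
        if qualifies(row):        # apply the selection criteria
            writer.writerow(row)  # transport the qualified row
            count += 1
    return count


target = io.StringIO()
extracted = extract(source, target, lambda row: row["status"] == "CLOSED")
```

Passing the criterion in as a function keeps the selection rule separate from the copying logic, so the same routine could serve several extracts.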

Data Extraction Cleanup
• Restructuring of records or fields
• Removal of Operational-only data
• Supply of missing field values
• Data Integrity checks
• Data Consistency and Range checks, etc.
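As an illustration of the cleanup steps, the sketch below drops operational-only fields, supplies a default for a missing value, and rejects records that fail an integrity or range check. The field names, the default value and the range limits are assumptions, not taken from the slides.

```python
OPERATIONAL_ONLY = {"lock_flag", "terminal_id"}  # hypothetical field names
DEFAULTS = {"region": "UNKNOWN"}                 # hypothetical default values
MAX_AMOUNT = 1_000_000                           # hypothetical upper limit


def clean(record):
    """Return a cleaned copy of record, or None if it fails a check."""
    # Removal of operational-only data.
    cleaned = {k: v for k, v in record.items() if k not in OPERATIONAL_ONLY}
    # Supply of missing field values.
    for field, default in DEFAULTS.items():
        if not cleaned.get(field):
            cleaned[field] = default
    # Integrity check: the key field must be present.
    if cleaned.get("order_id") is None:
        return None
    # Range check: a missing amount is treated as out of range.
    if not (0 <= cleaned.get("amount", -1) <= MAX_AMOUNT):
        return None
    return cleaned


good = clean({"order_id": 1, "region": "", "amount": 50.0, "lock_flag": "Y"})
bad = clean({"order_id": 2, "region": "East", "amount": -5.0})
```

Returning `None` for rejected records is one possible convention; a real cleanup step might instead route them to an error file for review.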

Data transformation
• Integrating dissimilar data types
• Changing codes
• Adding a time attribute
• Summarising data
• Calculating derived values
• Denormalising data
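Several of these transformations can be shown in a few lines. In this sketch the gender code mapping, the field names and the sample rows are hypothetical; it covers changing codes, integrating dissimilar data types, adding a time attribute, calculating a derived value and summarising.

```python
from collections import defaultdict
from datetime import date

# Hypothetical code mapping: two source systems encode gender differently,
# one with letters and one with integers.
GENDER_CODES = {"M": "MALE", "F": "FEMALE", "1": "MALE", "0": "FEMALE"}


def transform(rows, load_date):
    """Apply code changes, add a time attribute and a derived value."""
    out = []
    for row in rows:
        out.append({
            "customer": row["customer"],
            # str() integrates the int and str encodings before the code change.
            "gender": GENDER_CODES[str(row["gender"])],
            "revenue": row["qty"] * row["unit_price"],  # derived value
            "load_date": load_date.isoformat(),         # time attribute
        })
    return out


def summarise(rows, key):
    """Total the revenue per distinct value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["revenue"]
    return dict(totals)


rows = [
    {"customer": "A", "gender": "M", "qty": 2, "unit_price": 10.0},
    {"customer": "B", "gender": 0, "qty": 1, "unit_price": 5.0},
    {"customer": "A", "gender": 1, "qty": 3, "unit_price": 10.0},
]
transformed = transform(rows, date(1996, 3, 31))
revenue_by_customer = summarise(transformed, "customer")
```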

Data loading
• Initial and incremental loading
• Updating of metadata
• Updating of log
• Rollback in case of loading errors
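Rollback on a loading error can be sketched with `sqlite3`, whose connection object commits or rolls back a transaction when used as a context manager. The table layout, the `load_batch` helper and the log format are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales_fact (order_id INTEGER PRIMARY KEY, amount REAL NOT NULL)"
)
conn.execute("CREATE TABLE load_log (batch TEXT, rows_loaded INTEGER, status TEXT)")
conn.commit()


def load_batch(conn, batch_name, rows):
    """Load one batch atomically and log the outcome; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", rows)
            conn.execute("INSERT INTO load_log VALUES (?, ?, 'OK')",
                         (batch_name, len(rows)))
        return True
    except sqlite3.Error:
        with conn:  # record the failure in its own transaction
            conn.execute("INSERT INTO load_log VALUES (?, 0, 'ROLLED BACK')",
                         (batch_name,))
        return False


initial_ok = load_batch(conn, "initial", [(1, 120.0), (2, 80.0)])
# The second batch repeats order_id 2, so the whole batch is rolled back.
incremental_ok = load_batch(conn, "incremental", [(3, 60.0), (2, 99.0)])
row_count = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
log = conn.execute("SELECT * FROM load_log").fetchall()
```

The failed incremental batch leaves no partial rows behind: order 3 is not in the table even though it was inserted before the duplicate key was detected.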

ETL Tools
• Informatica
• Ardent DataStage
• Oracle Warehouse Builder
• Microsoft Data Transformation Service (DTS)

Data Access & Analysis
• Ease of navigation across screen
• Value addition by better information presentation (graphs, charts and maps)
• Highlighting exception information by Alarms and Alerts
• Drill-down / roll-up through successive levels of data
• What-if analysis

Reporting Tools
• Microstrategy
• Business Objects
• Cognos
• Brio
• Hyperion

What is OLAP?
OLAP is an On-line Analytical Processing technology which creates new business information from existing data, through a rich set of business transformations and numerical calculations.

OLAP Characteristics
• Always involves interactive query and analysis of the data. The interaction is usually multiple passes
• Involves drilling down into successively lower levels of detail data
• Involves roll-ups to higher levels of summarization and aggregation
• Offers analytical modelling capabilities
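Roll-up and drill-down can be illustrated as the same aggregation run at different levels of a hierarchy. This is a simplified sketch, not how any particular OLAP product works; the detail rows, the year / quarter / region hierarchy and the `aggregate` helper are assumptions.

```python
from collections import defaultdict

# Hypothetical detail rows: (year, quarter, region, revenue).
DETAIL = [
    (1995, "4Q", "North", 100.0),
    (1996, "1Q", "North", 120.0),
    (1996, "1Q", "East", 80.0),
    (1996, "2Q", "North", 135.0),
]
COLUMNS = ("year", "quarter", "region")


def aggregate(rows, levels, where=None):
    """Sum revenue grouped by the given levels, optionally filtered."""
    where = where or {}
    idx = {name: i for i, name in enumerate(COLUMNS)}
    totals = defaultdict(float)
    for row in rows:
        if all(row[idx[col]] == val for col, val in where.items()):
            totals[tuple(row[idx[lvl]] for lvl in levels)] += row[3]
    return dict(totals)


# Roll-up: the highest level of summarisation.
by_year = aggregate(DETAIL, ["year"])
# Drill-down, first pass: quarters within 1996.
by_quarter_1996 = aggregate(DETAIL, ["quarter"], {"year": 1996})
# Drill-down, second pass: regions within 1Q 1996.
by_region_1q96 = aggregate(DETAIL, ["region"], {"year": 1996, "quarter": "1Q"})
```

Each pass narrows the filter and moves one level down the hierarchy, which mirrors the multiple-pass interaction described above.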

Data marts
• Architecture
• Characteristics
• Example

Data mart in the DSS Architecture
[Figure: Data Sources (Operational Systems) and External Sources pass through Data Extraction, Scrubbing & Transformation into the Data Warehouse and the Data Mart, which are reached through Data Access. Other labels shown: OLAP, EIS, Data Mining, AI, OR.]

What are Data marts?
• Data marts are scaled-down and less expensive versions of data warehouses
• Data marts utilize large-scale data warehousing concepts on a smaller, more focussed level
• Data marts are focussed at departmental users
• Decentralised approach

What is data mining?
Data Mining is the extraction of hidden information from large databases. It is a powerful new technology with great potential to help companies focus on the most important information in the data warehouse.

Data mining capabilities
• Discovery of unknown patterns
• Prediction of trends and behaviors
• Discovery of anomalies in data