The document outlines the key steps in a Business Intelligence and Data Warehousing (BIDW) roadmap, including:
1) Planning the project, defining business processes, designing technical architecture, selecting products, and dimensional modeling.
2) Developing ETL processes (extract, transform, load) that move data through staging tables into target tables, together with table naming conventions.
3) Designing and developing BI applications, with layers such as data and presentation, and developing reports and dashboards.
4) Deploying and maintaining the BIDW system with version control, change management and environment management.
Copyright:
Attribution Non-Commercial (BY-NC)
Overall Process
- Program / Project Planning and Management
- Business Process Definition
- Technical Architecture Design
- Product Selection and Installation
- Dimensional Modeling

Author : Dave Goyal 3
Overall Process (contd.)
- Physical Design
- ETL Design and Development
- BI Application Design
- BI Application Development
- Deployment
- Change Management and Maintenance

Program / Project Planning and Management
- Define the Project
- Build the Business Case and Justification
- Plan the Project
- Manage the Project
- Manage the Program

Business Process
- Define Business Process
- Define Requirements using Interviews
- Define Requirements using Facilitated Sessions

Technical Architecture Design
- Back Room Architecture (Source, ETL)
- Presentation Server Architecture (Dimensional Architecture)
- Front Room Architecture (BI)
- Additional Architecture Features (Infrastructure, Metadata, Security)

Product Selection and Installation
- Architecture Plan (DW Architecture Diagram and Application Architecture Document)
- Product Selection (Hardware/OS, DBMS, ETL, BI, Data Profiling, Data Cleansing, etc.)

Dimensional Modeling Process
- Value Chain
- Business Process
- Choose the Business Process
- Declare the Grain
- Identify the Dimensions
- Identify the Facts
- Enterprise Bus Matrix

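As a rough illustration of the Enterprise Bus Matrix, the sketch below records which dimensions each business process uses and lists the dimensions shared by more than one process (the candidates for conformed dimensions). The process and dimension names are hypothetical and are not taken from the slides.

```python
# Hypothetical business processes and dimensions, for illustration only.
BUS_MATRIX = {
    "Policy Sales": {"Date", "Agent", "PolicyHolder", "Product"},
    "Claims": {"Date", "PolicyHolder", "Product", "Adjuster"},
    "Commissions": {"Date", "Agent"},
}


def shared_dimensions(matrix):
    """Return the dimensions used by more than one business process."""
    usage = {}
    for process, dimensions in matrix.items():
        for dimension in dimensions:
            usage.setdefault(dimension, []).append(process)
    return {dim: procs for dim, procs in usage.items() if len(procs) > 1}


shared = shared_dimensions(BUS_MATRIX)
for dimension in sorted(shared):
    print(dimension, "is used by", ", ".join(sorted(shared[dimension])))
```
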
Physical Design
- High Level Physical Design
- Develop Standards
- Develop the Physical Data Model
- Develop Initial Indexing Plan
- Design OLAP Database
- Design Aggregations

ETL Table Naming Convention
- D_ : Dimension Table
- F_ : Fact Table
- S_ : Source Table. Contains all data copied directly from a source file.
- X_ : Extract Table. Contains changed source data only. Changes may come from an incremental extract or be derived from a full extract.

ETL Table Naming Convention 2
- C_ : Clean Table. Contains source rows that have been cleaned.
- E_ : Error Table. Contains error rows found in source data.
- M_ : Master Table. Maintains history of all clean rows.
- T_ : Transform Table. Contains the data resulting from a transformation of source data.

ETL Table Naming Convention 3
- I_ : Insert Table. Contains new data to be inserted into the dimension table.
- U_ : Update Table. Contains changed data to be applied to the dimension table.

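The prefixes from the three naming-convention slides can be collected into a simple lookup. This is only a sketch; the helper name `describe_table` is hypothetical.

```python
# Prefix meanings copied from the naming-convention slides.
TABLE_PREFIXES = {
    "D_": "Dimension table",
    "F_": "Fact table",
    "S_": "Source table",
    "X_": "Extract table",
    "C_": "Clean table",
    "E_": "Error table",
    "M_": "Master table",
    "T_": "Transform table",
    "I_": "Insert table",
    "U_": "Update table",
}


def describe_table(name):
    """Return the role implied by a table name's two-character prefix."""
    prefix = name[:2].upper()
    if prefix not in TABLE_PREFIXES:
        raise ValueError(f"{name!r} does not follow the naming convention")
    return TABLE_PREFIXES[prefix]


print(describe_table("S_Agents"))
print(describe_table("T_Agent"))
```

A check like this could run before deployment to flag tables that do not follow the convention.
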
Data Quality
- Avoid NULL strings in dimension tables.
- Specify a default value for NOT NULL columns: 'N/A', 'Not Known', 'Invalid'.
- Dimension primary keys should be auto-generated surrogate keys.
- Allow data quality rows with keys 0, -1 and -2.

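A minimal sketch of the default-value rule, assuming dimension rows are handled as Python dicts before loading. The function name, the column names and the choice of 'Not Known' as the default are assumptions for illustration.

```python
# Assumed default; the slide lists 'N/A', 'Not Known' and 'Invalid' as options.
DEFAULT_TEXT = "Not Known"


def fill_defaults(row, text_columns):
    """Return a copy of row where NULL or blank text attributes get a default."""
    cleaned = dict(row)
    for column in text_columns:
        value = cleaned.get(column)
        if value is None or str(value).strip() == "":
            cleaned[column] = DEFAULT_TEXT
    return cleaned


# Hypothetical agent row with one NULL and one blank attribute.
agent = {"agent_code": "A100", "agent_name": "Smith", "region": None, "email": "  "}
filled = fill_defaults(agent, ["agent_name", "region", "email"])
print(filled)
```
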
Surrogate Keys
- Always use auto-generated surrogate keys for dimension keys.
- Use the SET IDENTITY_INSERT table_name ON and SET IDENTITY_INSERT table_name OFF SQL statements (SQL Server syntax) to create the key 0, -1 and -2 rows for each dimension when it is created:
    0 : INVALID
    -1 : UNKNOWN
    -2 : NOT APPLICABLE

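The seeding of the 0, -1 and -2 rows can be sketched with Python's built-in `sqlite3` module. This is an illustration under assumptions, not the author's implementation: SQLite accepts explicit values for an auto-generated key directly, whereas on SQL Server the explicit inserts would need to be wrapped in the IDENTITY_INSERT statements. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE D_Agent (
        agent_key   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        agent_code  TEXT NOT NULL,                      -- natural key
        agent_name  TEXT NOT NULL
    )
    """
)

# Seed the data quality rows with explicit keys when the dimension is created.
conn.executemany(
    "INSERT INTO D_Agent (agent_key, agent_code, agent_name) VALUES (?, ?, ?)",
    [
        (0, "INVALID", "INVALID"),
        (-1, "UNKNOWN", "UNKNOWN"),
        (-2, "N/A", "NOT APPLICABLE"),
    ],
)

# Ordinary rows omit the key and let the database generate it.
conn.execute(
    "INSERT INTO D_Agent (agent_code, agent_name) VALUES (?, ?)",
    ("A100", "Smith"),
)
conn.commit()

rows = conn.execute(
    "SELECT agent_key, agent_name FROM D_Agent ORDER BY agent_key"
).fetchall()
print(rows)
```

Fact rows whose agent cannot be resolved can then point at key -1 instead of holding a NULL foreign key.
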
ETL Design and Development
- Round Up the Requirements
- Extract Data from Source (3 Steps)
- Clean and Conform Data (5 Steps)
- Delivering Data (13 Steps)
- Managing the ETL Environment (13 Steps)

ETL Roadmap
ETL Implementation Process
- Analyze data quality thoroughly and have options available to resolve issues
- Define data source definitions
- Create a high-level source-to-target (S2T) map
- Create a detail-level S2T map
- Create the fact worksheet

ETL Process…Extract
1) Extract data to the S_ table (full load).
2) Compare the S_ table to the M_ table and load the differences into the X_ table.
3) De-duplicate the X_ table: move duplicate rows to the E_ table and move non-duplicate clean rows to the C_ table.
4) Compare the C_ table to the M_ table; insert new rows into M_ and update M_ with changed rows.

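The extract steps above can be sketched in plain Python, with lists of dicts standing in for the tables. Several assumptions are made for illustration: the M_ table is reduced to a dict holding only the latest clean row per natural key (the real master table keeps history), every row whose natural key repeats within X_ is treated as a duplicate and sent to E_, and the function and column names are hypothetical.

```python
from collections import Counter


def extract_step(s_rows, m_table, key="agent_code"):
    """Return (x_rows, c_rows, e_rows) and update m_table in place.

    s_rows  : full load from the source (S_ table), a list of dicts
    m_table : simplified master (M_ table), natural key -> latest clean row
    """
    # X_: source rows that are new or differ from the master copy.
    x_rows = [row for row in s_rows if m_table.get(row[key]) != row]

    # De-duplication (assumption): every row whose key repeats goes to E_.
    counts = Counter(row[key] for row in x_rows)
    e_rows = [row for row in x_rows if counts[row[key]] > 1]
    c_rows = [row for row in x_rows if counts[row[key]] == 1]

    # M_: insert new clean rows and overwrite changed ones.
    for row in c_rows:
        m_table[row[key]] = row
    return x_rows, c_rows, e_rows


m_agents = {"A1": {"agent_code": "A1", "name": "Smith"}}
s_agents = [
    {"agent_code": "A1", "name": "Smith"},  # unchanged, not extracted
    {"agent_code": "A2", "name": "Jones"},  # new
    {"agent_code": "A3", "name": "Lee"},    # duplicated natural key
    {"agent_code": "A3", "name": "Leigh"},
]
x_agents, c_agents, e_agents = extract_step(s_agents, m_agents)
print(len(x_agents), len(c_agents), len(e_agents))
```
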
ETL Process…Transform
1) Select and transform from C_ to T_.
2) Compare T_ with D_ for new and changed rows.
3) Insert new rows into I_ and changed rows into U_.

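A minimal sketch of the transform stage, again using lists of dicts in place of tables. The transformation itself (renaming a column and tidying the name) and all column names are hypothetical; only the C_ to T_ to I_/U_ flow comes from the slide.

```python
def transform(c_row):
    """Hypothetical transformation: rename a column and tidy the name."""
    return {
        "agent_code": c_row["agent_code"],
        "agent_name": c_row["name"].strip().title(),
    }


def split_new_and_changed(t_rows, d_rows, key="agent_code"):
    """Compare T_ rows with current D_ rows; return (i_rows, u_rows)."""
    current = {row[key]: row for row in d_rows}
    i_rows, u_rows = [], []
    for row in t_rows:
        existing = current.get(row[key])
        if existing is None:
            i_rows.append(row)  # natural key not in D_ yet -> I_
        elif any(existing.get(col) != val for col, val in row.items()):
            u_rows.append(row)  # an attribute changed -> U_
    return i_rows, u_rows


c_agent = [
    {"agent_code": "A1", "name": " smythe "},
    {"agent_code": "A2", "name": "JONES"},
]
d_agent = [{"agent_key": 1, "agent_code": "A1", "agent_name": "Smith"}]

t_agent = [transform(row) for row in c_agent]
i_agent, u_agent = split_new_and_changed(t_agent, d_agent)
print(i_agent)
print(u_agent)
```
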
ETL Process…Load
- Insert rows directly into the D_ table from I_.
- Update rows from U_ to D_ when the dimension is SCD type 1 or 3.
- Insert rows from U_ to D_ when the dimension is SCD type 2.
- Please note: dimension surrogate keys are generated during the Load stage.

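The load rules can be sketched as follows, with the D_ table held as a list of dicts. The `is_current` flag, the `agent_key` column and the function name are assumptions for illustration. Only the type 1 overwrite and the type 2 insert are shown; a type 3 update would also copy the old value into a "previous value" column.

```python
from itertools import count


def load_dimension(d_rows, i_rows, u_rows, scd_type, key="agent_code"):
    """Apply I_ and U_ rows to a D_ table; surrogate keys are generated here."""
    highest = max((row["agent_key"] for row in d_rows), default=0)
    next_key = count(highest + 1)

    for row in i_rows:  # new rows are always inserted
        d_rows.append({"agent_key": next(next_key), "is_current": True, **row})

    for row in u_rows:
        current = next(
            r for r in d_rows if r[key] == row[key] and r["is_current"]
        )
        if scd_type == 2:  # keep history: expire the old row, insert a new one
            current["is_current"] = False
            d_rows.append({"agent_key": next(next_key), "is_current": True, **row})
        else:  # type 1 behaviour: overwrite in place
            current.update(row)
    return d_rows


new_rows = [{"agent_code": "A2", "agent_name": "Jones"}]
changed_rows = [{"agent_code": "A1", "agent_name": "Smythe"}]

d_scd2 = [{"agent_key": 1, "agent_code": "A1", "agent_name": "Smith", "is_current": True}]
load_dimension(d_scd2, new_rows, changed_rows, scd_type=2)

d_scd1 = [{"agent_key": 1, "agent_code": "A1", "agent_name": "Smith", "is_current": True}]
load_dimension(d_scd1, [], changed_rows, scd_type=1)

print(len(d_scd2), len(d_scd1))
```

With type 2 the old "Smith" row stays in the dimension as history, while with type 1 it is overwritten and the surrogate key is unchanged.
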
ETL Process…To Remember
- S_, X_, M_, C_ and E_ tables should be named after source tables, such as S_Agents.
- T_, I_, U_ and D_ tables should be named after target tables, such as T_Agent, T_PolicyHolder, etc.
- Source table data sizes should follow the source data formats, except that natural keys should be varchar to accommodate data quality problems.

High Level BIDW System Architecture Model
BI Application Design
- Define the structure of the portal and its web pages
- Define high-level reporting requirements (dashboards, scorecards)
- Define analytical reporting requirements (cubes, interactive reports, ad hoc queries)
- Define detailed reporting requirements (filter-based reports, ad hoc queries)

BI Application Layers
BI Application Development
- Set up the development environment
- Set up the issue management system
- Develop all reports
- Test and balance each report against the source system

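One simple reading of "balancing" a report is to check that a summed measure in the report equals the same sum taken from the source system. The sketch below assumes both sides are available as lists of dicts; the function name, the measure name and the figures are hypothetical. Amounts are held as whole cents to avoid floating-point rounding.

```python
def balance_report(source_rows, report_rows, measure):
    """Compare the total of one measure between source and report rows."""
    source_total = sum(row[measure] for row in source_rows)
    report_total = sum(row[measure] for row in report_rows)
    return {
        "source_total": source_total,
        "report_total": report_total,
        "balanced": source_total == report_total,
    }


# Hypothetical source transactions and a report aggregated by region.
source = [
    {"policy": "P1", "premium_cents": 10000},
    {"policy": "P2", "premium_cents": 25000},
    {"policy": "P3", "premium_cents": 7500},
]
report = [
    {"region": "East", "premium_cents": 35000},
    {"region": "West", "premium_cents": 7500},
]
result = balance_report(source, report, "premium_cents")
print(result)
```
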
Deployment / Maintenance
- Design the version control system
- Define the change management process
- Define the documents to deploy changes from Dev, Test, QA to Production
- Manage and maintain environments