Example of ETL

This document describes the process of extracting and transforming data to load into a data warehouse. It involves removing unnecessary attributes, handling missing data, transforming dimensions consistently, and calculating new fields. Specifically, order ID, priority, shipping mode, container, and customer name will be removed. Shipping date will be used to calculate shipping delay rather than loaded directly. Transformed data with new IDs and calculated fields like order total and unit profit will then be loaded into the data warehouse facts table.

Uploaded by

Choir choir

Example

Count  Attribute
1      Order ID
2      Order Date
3      Order Priority
4      Order Quantity
5      Order Discount
6      Shipping Mode
7      Unit Price
8      Unit Cost
9      Shipping Cost
10     Customer Name
11     Province
12     Customer Type
13     Product Category
14     Product Name
15     Product Container
16     Shipping Date

This data needs to be loaded into a data warehouse with the following schema:

Data Staging
This process saves the input files to local storage.

Data Extraction
At this stage we remove the data that does not need to be stored in our data warehouse, that is, all
the unnecessary attributes: order ID, order priority, shipping mode, customer name, product category,
product name, and product container.

Count  Attribute
1      Order Date
2      Order Quantity
3      Order Discount
4      Unit Price
5      Unit Cost
6      Shipping Cost
7      Province
8      Customer Type
9      Shipping Date
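
The extraction step can be sketched in Python as follows. This is a minimal sketch, assuming the staged input is a CSV file whose header names match the attribute list; the header spellings, the sample row, and its values are made up for illustration and are not part of the original exercise.

```python
import csv
import io

# Attributes kept after extraction (the nine listed in the table above).
# Header spellings are assumed; match them to the real input file.
KEPT_ATTRIBUTES = [
    "Order Date", "Order Quantity", "Order Discount", "Unit Price",
    "Unit Cost", "Shipping Cost", "Province", "Customer Type",
    "Shipping Date",
]


def extract(rows):
    """Return each input row reduced to the kept attributes."""
    return [{name: row[name] for name in KEPT_ATTRIBUTES} for row in rows]


# Hypothetical staged input: some of the removed columns plus all nine
# kept ones. The values are invented for illustration only.
SAMPLE_CSV = """\
Order ID,Order Date,Order Priority,Order Quantity,Order Discount,Unit Price,Unit Cost,Shipping Cost,Customer Name,Province,Customer Type,Shipping Date
1001,2020-01-06,High,10,0.25,20.0,12.0,5.0,A. Customer,Ontario,Consumer,2020-01-08
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
extracted = extract(reader)
print(extracted[0])
```

Selecting the columns to keep, rather than listing the columns to drop, means that any unexpected extra column in the input is also left out.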

Question:
Why did we remove the product name? It seems highly necessary for the analysis.
I agree that the product name is necessary for the analysis. However, in the last lesson we decided
not to use it in this exercise, to save space.
Question:
Why did we not remove the shipping date? I could not find a shipping date field in the data warehouse
schema.
The shipping date will be used in the data transformation stage to calculate the shipping delay.

Data Transformation
At this stage we need to transform the data so that it has the same format as the schema of the data
warehouse.
Tasks should include:
• Handling missing data. Records can be removed or replaced, depending on the specifics of the
  business knowledge.
• Handling all the dimensional information that exists in the input data but does not exist in the
  data warehouse. For example, the input data may include a customer type with the value
  “international buyer”, which does not exist in our data warehouse. Should we add it or ignore it?
  The answer is difficult because it depends on the business requirements. If the dimension
  tables are shared by multiple schemas, then there is very tight control over the values of these
  dimensions. For example, a customer type value might already exist under a slightly different
  name, such as “International customer”; in this case the two values need to be merged. Or
  maybe our analysis is not interested in international customers because they are very rare; in
  this case the data should be ignored.
• Generating fields that need to be calculated. For example, the field unit profit needs to be
  calculated from existing fields.
In this example, for simplicity, we will ignore all data that has new, unknown dimension values,
except for dates, and we will remove all data with missing values. The resulting data should look
like this, with new fields marked (new) and removed fields marked (removed):

Count  Attribute                 Note
1      Order Date (removed)      This field is replaced by an ID
1      Order Date ID (new)
2      Order Quantity
3      Order Discount
4      Unit Price
5      Unit Cost
6      Shipping Cost
7      Province (removed)        This field is replaced by an ID
7      Province ID (new)
8      Customer Type (removed)   This field is replaced by an ID
8      Customer Type ID (new)
9      Shipping Date (removed)   This field is replaced by shipping delay
9      Shipping Delay (new)      Shipping Date - Order Date
10     Order Total (new)         (Order Quantity * Unit Price + Shipping Cost) * (1 - Order Discount)
11     Unit Profit (new)         Unit Price * (1 - Order Discount) - Unit Cost
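
The transformation rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the dimension lookups `PROVINCE_IDS` and `CUSTOMER_TYPE_IDS`, the `YYYYMMDD` date key, the ISO date format of the input, the output field names, and the sample row are all hypothetical. Only the three formulas (shipping delay, order total, unit profit) are taken from the table.

```python
from datetime import date

# Hypothetical dimension lookups (value -> id). In a real warehouse
# these would be read from the dimension tables.
PROVINCE_IDS = {"Ontario": 1, "Quebec": 2}
CUSTOMER_TYPE_IDS = {"Consumer": 1, "Corporate": 2}


def date_id(d):
    """Assumed date key format: YYYYMMDD as an integer."""
    return d.year * 10000 + d.month * 100 + d.day


def transform(row):
    """Return the fact row, or None if the input row must be dropped."""
    # Remove rows with any missing value.
    if any(value in (None, "") for value in row.values()):
        return None
    # Ignore rows whose province or customer type is an unknown dimension value.
    province_id = PROVINCE_IDS.get(row["Province"])
    customer_type_id = CUSTOMER_TYPE_IDS.get(row["Customer Type"])
    if province_id is None or customer_type_id is None:
        return None

    order_date = date.fromisoformat(row["Order Date"])
    shipping_date = date.fromisoformat(row["Shipping Date"])
    quantity = int(row["Order Quantity"])
    discount = float(row["Order Discount"])
    unit_price = float(row["Unit Price"])
    unit_cost = float(row["Unit Cost"])
    shipping_cost = float(row["Shipping Cost"])

    return {
        "order_date_id": date_id(order_date),
        "province_id": province_id,
        "customer_type_id": customer_type_id,
        "order_quantity": quantity,
        "order_discount": discount,
        "unit_price": unit_price,
        "unit_cost": unit_cost,
        "shipping_cost": shipping_cost,
        # Formulas from the table above.
        "shipping_delay": (shipping_date - order_date).days,
        "order_total": (quantity * unit_price + shipping_cost) * (1 - discount),
        "unit_profit": unit_price * (1 - discount) - unit_cost,
    }


# Hypothetical extracted row; values are invented for illustration.
sample_row = {
    "Order Date": "2020-01-06", "Order Quantity": "10",
    "Order Discount": "0.25", "Unit Price": "20.0", "Unit Cost": "12.0",
    "Shipping Cost": "5.0", "Province": "Ontario",
    "Customer Type": "Consumer", "Shipping Date": "2020-01-08",
}
fact = transform(sample_row)
print(fact)
```

Returning `None` for dropped rows keeps the two simplifications of this exercise (remove missing values, ignore unknown dimension values) in one place, so a different business rule, such as merging “international buyer” into an existing customer type, would only change the lookup step.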

Data Loading
This data can now be loaded into the facts table of the data warehouse.
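
The loading step can be sketched in Python as follows. This is a minimal sketch, assuming an SQLite database stands in for the data warehouse; the table name `sales_facts`, its column names, and the sample row are hypothetical, because the actual schema is not reproduced in this text.

```python
import sqlite3

# Hypothetical facts table; the real table and column names come from
# the data warehouse schema.
CREATE_FACTS = """
CREATE TABLE IF NOT EXISTS sales_facts (
    order_date_id    INTEGER NOT NULL,
    province_id      INTEGER NOT NULL,
    customer_type_id INTEGER NOT NULL,
    order_quantity   INTEGER NOT NULL,
    order_discount   REAL NOT NULL,
    unit_price       REAL NOT NULL,
    unit_cost        REAL NOT NULL,
    shipping_cost    REAL NOT NULL,
    shipping_delay   INTEGER NOT NULL,
    order_total      REAL NOT NULL,
    unit_profit      REAL NOT NULL
)
"""

INSERT_FACT = """
INSERT INTO sales_facts VALUES (
    :order_date_id, :province_id, :customer_type_id, :order_quantity,
    :order_discount, :unit_price, :unit_cost, :shipping_cost,
    :shipping_delay, :order_total, :unit_profit
)
"""


def load(connection, fact_rows):
    """Insert all transformed rows in a single transaction."""
    with connection:  # commits on success, rolls back on error
        connection.execute(CREATE_FACTS)
        connection.executemany(INSERT_FACT, fact_rows)


# Hypothetical transformed row; values are invented for illustration.
fact_rows = [{
    "order_date_id": 20200106, "province_id": 1, "customer_type_id": 1,
    "order_quantity": 10, "order_discount": 0.25, "unit_price": 20.0,
    "unit_cost": 12.0, "shipping_cost": 5.0, "shipping_delay": 2,
    "order_total": 153.75, "unit_profit": 3.0,
}]

connection = sqlite3.connect(":memory:")
load(connection, fact_rows)
count = connection.execute("SELECT COUNT(*) FROM sales_facts").fetchone()[0]
print(count)
```

Loading inside one transaction means that a failure part-way through leaves the facts table unchanged instead of partially loaded.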
