Group 6 - Data Warehouse
FINAL EXAM
[AUTHOR NAME]
INS307301 [ DATA WAREHOUSE ]
Hanoi – 08/06/2022
TABLE OF CONTENTS
LIST OF TABLES
Table 2.1: Customer attribute table
Table 2.2: Position attribute table
Table 2.3: Branch attribute table
Table 2.4: Producer attribute table
Table 2.5: Employee attribute table
Table 2.6: Product_Type attribute table
Table 2.7: Product attribute table
Table 2.8: Invoice attribute table
Table 2.9: detail_invoice attribute table
LIST OF IMAGES
Figure 1.1: Business chart of the company
Figure 2.1: Association Entity Schema
Figure 2.2: Relationship diagram
Figure 2.3: Relational schema installed on SQL Server
Figure 3.1: Data warehouse star diagram
Figure 3.2: Creating a Data Flow Task
Figure 3.3: Creating a data connection with the repository
Figure 3.4: Loading data into the warehouse
Figure 3.5: Loading data into the store
Figure 3.6: Loading data into the store
Figure 3.7: Loading data into the store
Figure 3.8: Loading data into the store
Figure 3.9: Loading data into the store
Figure 3.10: Creating SSAS
Figure 3.11: Cube diagram
Figure 3.12: SSAS on Analysis Server
Figure 4.1: MDX 1 query
Figure 4.2: MDX 2 query
Figure 4.3: MDX 3 query
Figure 4.4: MDX 4 query
Figure 4.5: MDX 5 query
Figure 4.6: Analysis of employee data by company branch
Figure 4.7: Analysis of employee data by sales productivity for customers
Figure 4.8: Analysis of employee data by products sold by the company
INTRODUCTION
People pay increasing attention to personal care, so more and more cosmetic shops are opening
to meet this demand. Large distributors have also recognized the growing market and opened
branches in many provinces at home and abroad. As competition between companies intensifies,
using data to analyze trends has become common; it supports decisions about which potential
products to produce, local revenue statistics, and product development. The most difficult
task is evaluating each employee's capacity and potential, so our group chose the topic
Employee Attrition and Performance of a company. The goal is to overcome mistakes in employee
management and to help businesses make decisions based on actual data.
In this report there are 5 main parts:
Part 1: Overview of the problem
Part 2: System Design
Part 3: Data warehouse design
Part 4: MDX Language - Analyzing Employee Data
Part 5: Conclusion
Laura Sunshine is a cosmetics company that trades cosmetic products and employs a large sales
staff. Each employee's salary is calculated from the number of products that employee sells to
customers. To manage the employees' work, we need to manage the following information.
Each product sold by the company has its own code (product code, product name, selling price,
description, entry price, ...).
Each product belongs to one manufacturer, and each manufacturer has a unique identifier
(manufacturer ID, manufacturer name). The company sells many types of products, and each
product type contains many products of the same type (product type code, product type name).
The company has many branches across districts, provinces and cities (branch code, branch
name, address, phone number). The company has many positions (title code, title name). Each
employee holds one position, and each position is held by many employees. Each employee belongs
to only one branch, and each branch has many employees. An employee's sales depend on the
amount customers purchase, so customer information is also stored (customer code, customer
name, address, email).
Each customer can have multiple invoices, and each invoice is issued by one employee. Rewards
are given to the employees with the most customers. Invoice information includes the invoice
code, total amount, and date of issue. A customer can buy many products on the same invoice,
and each product on the invoice has its own purchase quantity. Employee productivity is
calculated from that employee's sales.
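The productivity rule described above (productivity is calculated from an employee's sales) can be illustrated with a minimal Python sketch. The sample invoices and the function names `sales_by_employee` and `rank_employees` are assumptions made for illustration; the report does not give the exact formula, so plain invoice totals per employee are used here.

```python
from collections import defaultdict

# Hypothetical sample invoices: (Inv_code, emp_code, Sum_price).
invoices = [
    ("INV001", "E01", 120.0),
    ("INV002", "E02", 80.0),
    ("INV003", "E01", 45.5),
]


def sales_by_employee(rows):
    """Sum the invoice totals for each employee code."""
    totals = defaultdict(float)
    for _inv_code, emp_code, sum_price in rows:
        totals[emp_code] += sum_price
    return dict(totals)


def rank_employees(rows):
    """Return (emp_code, total) pairs, highest sales first."""
    totals = sales_by_employee(rows)
    return sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
```

Calling `rank_employees(invoices)` lists employees from highest to lowest total sales, which is one simple way to compare productivity.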
- Design a data warehouse containing the information needed for the purposes of analysis
Scope:
The application is intended to be piloted at small and medium-sized companies first, to check
how it operates and to identify any shortcomings in the process.
Table 2.1: Customer attribute table
Entity type: Customer
No.  Attribute name   Description       Type      Size  Note
1    customer_code    Customer's code   varchar   20    primary key
2    customer_name    Customer name     nvarchar  255
4    number_phone     Phone number      nvarchar  255
Table 2.2: Position attribute table
Entity type: Position
No.  Attribute name   Description       Type      Size  Note
1    Pos_code         Title code        varchar   20    primary key
Table 2.3: Branch attribute table
Entity type: Branch
No.  Attribute name   Description       Type      Size  Note
1    Branch_code      Branch code       varchar   20    primary key
2    Branch_name      Branch name       nvarchar  230
4    num_phone        Phone number      varchar   11
Table 2.4: Producer attribute table
Entity type: Producer
No.  Attribute name   Description        Type      Size  Note
1    Producer_code    Manufacturer code  varchar   20    primary key
2    Producer_name    Manufacturer name  nvarchar  255
Table 2.5: Employee attribute table
Entity type: Employee
No.  Attribute name          Description    Type      Size  Note
1    Emp_code                Employee code  varchar   20    primary key
2    Branch_code             Branch code    varchar   20    foreign key
3    first_name              Surname        nvarchar  255
4    Last_name               Name           nvarchar  255
5    Birthday                DOB            date
7    Citizen_identification  Citizen ID     nvarchar  12
8    Address                 Address        nvarchar  100
9    num_phone               Phone number   nvarchar  11
10   Email                   Email          nvarchar  50
12   Pos_code                Title code     varchar   20    foreign key
Table 2.6: Product_Type attribute table
Entity type: Product_Type
No.  Attribute name   Description        Type      Size  Note
2    Type_name        Product type name  nvarchar  255
Table 2.7: Product attribute table
Entity type: Product
No.  Attribute name   Description    Type      Size  Note
1    Product_code     Product code   varchar   20    primary key
2    Product_name     Product name   varchar   20    foreign key
3    Image            Image          nvarchar  255
4    Import_price     Entry price    nvarchar  255
9    Branch_code      Branch code    nvarchar  11    foreign key
Table 2.8: Invoice attribute table
Entity type: Invoice
No.  Attribute name   Description      Type     Size  Note
1    Inv_code         Invoice code     varchar  20    primary key
2    customer_code    Customer's code  varchar  20    foreign key
3    Branch_code      Branch code      varchar  20    foreign key
4    emp_code         Employee code    varchar  20    foreign key
5    Sum_price        Total amount     money
6    Date_founded     Date of issue    date
Table 2.9: detail_invoice attribute table
Entity type: detail_invoice
No.  Attribute name   Description         Type     Size  Note
1    Inv_code         Invoice code        varchar  20    primary key, foreign key
2    pro_code         Product code        varchar  20    primary key, foreign key
3    amount           Quantity purchased  int
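The key structure of the invoice tables above can be sketched in code. This is a minimal illustration only: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server, so the column types are SQLite approximations, only a few columns are included, and the function name `create_invoice_schema` is an assumption. It shows the composite primary key on detail_invoice and its two foreign keys.

```python
import sqlite3


def create_invoice_schema(conn):
    """Create simplified Product, Invoice and detail_invoice tables."""
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Product (
            Product_code TEXT PRIMARY KEY,
            Product_name TEXT
        );
        CREATE TABLE Invoice (
            Inv_code     TEXT PRIMARY KEY,
            Sum_price    NUMERIC,
            Date_founded TEXT
        );
        CREATE TABLE detail_invoice (
            Inv_code TEXT NOT NULL REFERENCES Invoice(Inv_code),
            pro_code TEXT NOT NULL REFERENCES Product(Product_code),
            amount   INTEGER NOT NULL,
            -- One row per product per invoice.
            PRIMARY KEY (Inv_code, pro_code)
        );
    """)
```

With this layout the same product cannot appear twice on one invoice, and a detail row cannot point at an invoice or product that does not exist.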
The warehouse is designed as a star schema because star schemas are easy to query. The team is
currently targeting small and medium-sized shops, for which a star schema is simpler than a
snowflake schema; a snowflake schema is better suited to large enterprises or to data that must
be retrieved along many dimensions.
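To illustrate why a star schema keeps queries simple, here is a minimal sketch using Python's built-in `sqlite3` module in place of SQL Server. The table names `Dim_Branch`, `Dim_Employee` and `Fact_Sales`, the sample rows, and the function names are all hypothetical; the report's actual star diagram may use different tables and measures.

```python
import sqlite3


def build_star(conn):
    """Create a tiny star schema: one fact table keyed to two dimensions."""
    conn.executescript("""
        CREATE TABLE Dim_Branch   (Branch_code TEXT PRIMARY KEY, Branch_name TEXT);
        CREATE TABLE Dim_Employee (Emp_code TEXT PRIMARY KEY, Last_name TEXT);
        CREATE TABLE Fact_Sales (
            Emp_code    TEXT REFERENCES Dim_Employee(Emp_code),
            Branch_code TEXT REFERENCES Dim_Branch(Branch_code),
            amount      INTEGER,   -- quantity sold
            Sum_price   INTEGER    -- sales value
        );
    """)
    conn.executemany("INSERT INTO Dim_Branch VALUES (?, ?)",
                     [("B01", "Hanoi"), ("B02", "Da Nang")])
    conn.executemany("INSERT INTO Dim_Employee VALUES (?, ?)",
                     [("E01", "An"), ("E02", "Binh"), ("E03", "Chi")])
    conn.executemany("INSERT INTO Fact_Sales VALUES (?, ?, ?, ?)",
                     [("E01", "B01", 2, 100),
                      ("E02", "B01", 1, 50),
                      ("E03", "B02", 3, 70)])
    conn.commit()


def revenue_by_branch(conn):
    """Total sales value per branch: a single join from fact to dimension."""
    return conn.execute("""
        SELECT b.Branch_name, SUM(f.Sum_price)
        FROM Fact_Sales AS f
        JOIN Dim_Branch AS b ON b.Branch_code = f.Branch_code
        GROUP BY b.Branch_name
        ORDER BY b.Branch_name
    """).fetchall()
```

Every dimension is one join away from the fact table, so each analysis needs only one join per dimension it uses. In a snowflake schema a dimension would be split further, adding joins.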
Step 2: In Load_Dimention, connect the OLE DB Source and the Excel source to the OLE DB
Destination
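The data flow in this step is configured graphically in SSIS, but the idea of feeding two sources into one destination can be sketched in Python. Everything here is an assumption for illustration: `sqlite3` stands in for the OLE DB connections, CSV text stands in for the Excel file, and the names `Customer`, `Dim_Customer` and `load_dim_customer` are hypothetical.

```python
import csv
import io
import sqlite3

# Stand-in for the Excel source: CSV text with hypothetical customers.
SPREADSHEET = "customer_code,customer_name\nC03,Lan\nC04,Minh\n"


def load_dim_customer(source_conn, spreadsheet_text, dest_conn):
    """Copy customers from both sources into Dim_Customer; return rows loaded."""
    # Source 1: rows read from the operational database.
    db_rows = source_conn.execute(
        "SELECT customer_code, customer_name FROM Customer"
    ).fetchall()
    # Source 2: rows read from the spreadsheet-like text.
    reader = csv.DictReader(io.StringIO(spreadsheet_text))
    sheet_rows = [(row["customer_code"], row["customer_name"]) for row in reader]
    # Destination: the dimension table in the warehouse.
    rows = db_rows + sheet_rows
    dest_conn.executemany("INSERT INTO Dim_Customer VALUES (?, ?)", rows)
    dest_conn.commit()
    return len(rows)
```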
Start loading data into the warehouse. Because an overwrite mechanism is installed, loading new
data does not cause errors.
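The report does not say how this mechanism is implemented in the SSIS package. One common way to make a repeated load safe is to replace rows whose key already exists, and the sketch below shows that idea under stated assumptions: it uses Python's `sqlite3` module and SQLite's `INSERT OR REPLACE`, and the table `Dim_Branch` and function `upsert_branches` are hypothetical names.

```python
import sqlite3


def upsert_branches(conn, rows):
    """Load branch rows; a row whose Branch_code already exists is replaced."""
    conn.executemany(
        "INSERT OR REPLACE INTO Dim_Branch (Branch_code, Branch_name) VALUES (?, ?)",
        rows,
    )
    conn.commit()
```

Running the same load twice leaves one row per branch code instead of raising a duplicate-key error, so the package can be re-run when new data arrives.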
4.2 Analyze data on the schema
https://app.powerbi.com/groups/me/reports/c76eadd5-0651-46f5-bb7a-80e06f75879b/ReportSectionbdbda0e8686329395c60
CHAPTER 5: CONCLUSION
Because of our limited time and knowledge, the application still has many shortcomings, but the
team managed to fulfill the important criteria and meet the requirements.
The team designed an operational database to store information about the company's
operations, and designed a data warehouse containing information related to the shop's
business activities.
The team successfully transferred data from the operational database to the warehouse and
loaded data from several sources into the warehouse.
The team produced revenue statistics in the form of graphs using MDX statements.
Finally, we sincerely thank you for helping us carry out the course project, and for the
knowledge shared in lectures and the many soft skills that are very useful for students
approaching their final year, like us.