
INS307301 [ DATA WAREHOUSE ]

FINAL EXAM

Teacher: Nguyen Ha Nam

Name                     Student code  Percent
Nguyễn Thị Bích Phượng   18071490      100%
Lê Thị Hằng              20071016      90%
Nguyễn Thị Minh Hằng     20071018      90%
Trần Lương Huệ Chi       16071272      65%

[AUTHOR NAME]


Hanoi – 08/06/2022

TABLE OF CONTENTS

LIST OF SYMBOLS AND ABBREVIATIONS
LIST OF FIGURES
INTRODUCTION
CHAPTER 1: PROBLEM OVERVIEW
1.1. Introduction to the problem
1.2. Objectives and scope of the project
CHAPTER 2: SYSTEM DESIGN
2.1 Entity-relationship diagram (ERD)
2.2 Building a relational database
CHAPTER 3: DATA WAREHOUSE DESIGN
3.1 Data warehouse diagram: fact table and dimension tables
3.2 Steps to load data
CHAPTER 4: MDX LANGUAGE - EMPLOYEE DATA ANALYSIS
4.1 MDX query language
4.2 Analyze data on the schema
CHAPTER 5: CONCLUSION


LIST OF SYMBOLS AND ABBREVIATIONS

Acronym  English                      Vietnamese
DBMS     Database Management System   Hệ quản trị cơ sở dữ liệu
CSDL     Database                     Cơ sở dữ liệu
SQL      Structured Query Language    Ngôn ngữ truy vấn

LIST OF TABLES
Table 2.1: Customer attribute table
Table 2.2: Position attribute table
Table 2.3: Branch attribute table
Table 2.4: Producer attribute table
Table 2.5: Employee attribute table
Table 2.6: Product_Type attribute table
Table 2.7: Product attribute table
Table 2.8: Invoice attribute table
Table 2.9: detail_invoice attribute table

LIST OF FIGURES
Figure 1.1: Business chart of the company
Figure 2.1: Entity-relationship diagram
Figure 2.2: Relationship diagram
Figure 2.3: Relational schema installed on SQL Server
Figure 3.1: Data warehouse star diagram
Figure 3.2: Creating a Data Flow Task
Figure 3.3: Creating a data connection with the data warehouse
Figure 3.4: Loading data into the warehouse
Figure 3.5: Loading data into the warehouse
Figure 3.6: Loading data into the warehouse
Figure 3.7: Loading data into the warehouse
Figure 3.8: Loading data into the warehouse
Figure 3.9: Loading data into the warehouse
Figure 3.10: Creating SSAS
Figure 3.11: Cube diagram
Figure 3.12: SSAS on Analysis Server
Figure 4.1: MDX query 1
Figure 4.2: MDX query 2
Figure 4.3: MDX query 3
Figure 4.4: MDX query 4
Figure 4.5: MDX query 5
Figure 4.6: Analysis of employee data by company branch
Figure 4.7: Analysis of employee data by sales productivity for customers
Figure 4.8: Analysis of employee data by products sold by the company


INTRODUCTION
Nowadays, people pay increasing attention to personal care, so more and more cosmetic
shops are opening to meet this demand. Large distributors have also recognized the
growing market and opened branches in many provinces at home and abroad. Competition
among companies keeps increasing, so using data to analyze trends has become common. It
supports decisions about which products to produce, local revenue statistics, and product
development. The most difficult task is evaluating each employee's capacity and
analyzing each employee's potential, so our group chose the topic Employee Attrition
and Performance of a company. The goal is to overcome mistakes in employee management
and to help businesses make decisions based on actual data.
This report has 5 main parts:
Part 1: Overview of the problem
Part 2: System design
Part 3: Data warehouse design
Part 4: MDX language - analyzing employee data
Part 5: Conclusion


CHAPTER 1: PROBLEM OVERVIEW

1.1. Introduction to the problem


To support the course project on Employee Attrition and Performance of a company, the group
surveyed the business process of a cosmetic company, focusing on how its employees work.

Laura Sunshine cosmetic company specializes in trading cosmetic products and has a large
sales staff. Each employee's salary is calculated from the number of products that employee
sells to customers. To manage the employees' work, we need to manage the following
information. For the products sold by the company, each product has its own code (product
code, product name, selling price, description, import price, and so on).

Each product belongs to one manufacturer, and each manufacturer has a unique identifier
(manufacturer ID, manufacturer name). The company sells many types of products, and each
product type contains many products (product type code, product type name). The company has
branches in many districts, provinces and cities (branch code, branch name, address, phone
number). The company has many positions (title code, title name). Each employee holds one
position, and each position can be held by many employees. Each employee belongs to only one
branch, and each branch has many employees. An employee's sales depend on customer
purchases, so customer information is also stored (customer code, customer name, address,
email).

Each customer can have multiple invoices. Each invoice is created by one employee, and
rewards go to the employees with the most customers. Invoice information includes the
invoice code, total amount and date of issue. A single invoice can contain many products,
each with its own purchase quantity. Employee productivity is calculated from that
employee's sales.
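
The productivity rule at the end of this description can be sketched in a few lines of Python. This is only an illustration: the report does not give the exact formula, so taking productivity as the total invoice amount per employee is an assumption, and the invoice records, employee codes and amounts below are made up.

```python
from collections import defaultdict

# Hypothetical invoices: (invoice code, employee code, total amount).
# Codes and amounts are illustrative, not data from the report.
invoices = [
    ("INV01", "E01", 120.0),
    ("INV02", "E02", 80.0),
    ("INV03", "E01", 45.5),
]


def sales_by_employee(rows):
    """Sum the invoice totals for each employee code."""
    totals = defaultdict(float)
    for _inv_code, emp_code, amount in rows:
        totals[emp_code] += amount
    return dict(totals)


# Assumed productivity measure: total sales per employee.
productivity = sales_by_employee(invoices)
```

Ranking employees then reduces to sorting this dictionary by value, for example `sorted(productivity.items(), key=lambda kv: kv[1], reverse=True)`.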


Figure 1.1: Business chart of the company


1.2. Objectives and scope of the project
Objectives:
- Create a business database to manage the shop's activities
- Design a data warehouse containing the information needed for analysis
- Load data from multiple sources into the data warehouse
- Use the data warehouse to produce analytical charts

Scope:
The application should first be tested at small and medium-sized companies, to check how it
operates and to find any shortcomings in the operation process.


CHAPTER 2: SYSTEM DESIGN


2.1 Entity-relationship diagram (ERD)

Figure 2.1: Entity-relationship diagram


2.2 Building a relational database

The relational database is built from the ERD.

Figure 2.2: Relationship diagram


Entity type: Customer

No.  Attribute name  Description    Type      Size  Note
1    customer_code   Customer code  varchar   20    Primary key
2    customer_name   Customer name  nvarchar  255
3    address         Address        nvarchar  255
4    number_phone    Phone number   nvarchar  255
5    Email           Email          nvarchar  255
6    Work            Work           int

Table 2.1: Customer attribute table

Entity type: Position

No.  Attribute name  Description  Type      Size  Note
1    Pos_code        Title code   varchar   20    Primary key
2    Pos_name        Job title    nvarchar  255

Table 2.2: Position attribute table

Entity type: Branch

No.  Attribute name  Description   Type      Size  Note
1    Branch_code     Branch code   varchar   20    Primary key
2    Branch_name     Branch name   nvarchar  230
3    address         Address       nvarchar  255
4    num_phone       Phone number  varchar   11

Table 2.3: Branch attribute table


Entity type: Producer

No.  Attribute name  Description        Type      Size  Note
1    Producer_code   Manufacturer code  varchar   20    Primary key
2    Producer_name   Manufacturer name  nvarchar  255

Table 2.4: Producer attribute table

Entity type: Employee

No.  Attribute name          Description    Type      Size  Note
1    Emp_code                Employee code  varchar   20    Primary key
2    Branch_code             Branch code    varchar   20    Foreign key
3    first_name              Surname        nvarchar  255
4    Last_name               Given name     nvarchar  255
5    Birthday                Date of birth  date
6    Sex                     Sex            nchar     10
7    Citizen_identification  Citizen ID     nvarchar  12
8    Address                 Address        nvarchar  100
9    num_phone               Phone number   nvarchar  11
10   Email                   Email          nvarchar  50
11   salary                  Salary         float
12   Pos_code                Title code     varchar   20    Foreign key

Table 2.5: Employee attribute table

Entity type: Product_Type

No.  Attribute name  Description        Type      Size  Note
1    Type_code       Product type code  varchar   20    Primary key
2    Type_name       Product type name  nvarchar  255

Table 2.6: Product_Type attribute table

Entity type: Product

No.  Attribute name  Description        Type      Size  Note
1    Product_code    Product code       varchar   20    Primary key
2    Product_name    Product name       varchar   20
3    Image           Image              nvarchar  255
4    Import_price    Import price       nvarchar  255
5    Price           Price              date      255
6    Describe        Description        nchar     10
7    Producer_code   Manufacturer code  nvarchar  12    Foreign key
8    Type_code       Product type code  nvarchar  100   Foreign key
9    Branch_code     Branch code        nvarchar  11    Foreign key

Table 2.7: Product attribute table

Entity type: Invoice

No.  Attribute name  Description    Type     Size  Note
1    Inv_code        Invoice code   varchar  20    Primary key
2    customer_code   Customer code  varchar  20    Foreign key
3    Branch_code     Branch code    varchar  20    Foreign key
4    emp_code        Employee code  varchar  20    Foreign key
5    Sum_price       Total amount   money
6    Date_founded    Date of issue  date

Table 2.8: Invoice attribute table

Entity type: detail_invoice

No.  Attribute name  Description         Type     Size  Note
1    Inv_code        Invoice code        varchar  20    Primary key, foreign key
2    pro_code        Product code        varchar  20    Primary key, foreign key
3    amount          Quantity purchased  int
4    Into_money      Line total          float

Table 2.9: detail_invoice attribute table
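
To make the key constraints in these attribute tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the group's SQL Server script: SQLite stands in for SQL Server, only a few columns per table are kept, the types are simplified, and the sample rows are hypothetical. It shows the composite primary key of detail_invoice and a foreign key from Invoice to Employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Branch (
    Branch_code TEXT PRIMARY KEY,
    Branch_name TEXT
);
CREATE TABLE Employee (
    Emp_code    TEXT PRIMARY KEY,
    Branch_code TEXT REFERENCES Branch(Branch_code),
    Last_name   TEXT
);
CREATE TABLE Product (
    Product_code TEXT PRIMARY KEY,
    Product_name TEXT
);
CREATE TABLE Invoice (
    Inv_code  TEXT PRIMARY KEY,
    emp_code  TEXT REFERENCES Employee(Emp_code),
    Sum_price REAL
);
CREATE TABLE detail_invoice (
    Inv_code   TEXT REFERENCES Invoice(Inv_code),
    pro_code   TEXT REFERENCES Product(Product_code),
    amount     INTEGER,
    Into_money REAL,
    PRIMARY KEY (Inv_code, pro_code)  -- one line per product per invoice
);
""")

# Hypothetical sample rows, inserted parents first.
conn.execute("INSERT INTO Branch VALUES ('B01', 'Hanoi branch')")
conn.execute("INSERT INTO Employee VALUES ('E01', 'B01', 'Lan')")
conn.execute("INSERT INTO Product VALUES ('P01', 'Lipstick')")
conn.execute("INSERT INTO Invoice VALUES ('INV01', 'E01', 30.0)")
conn.execute("INSERT INTO detail_invoice VALUES ('INV01', 'P01', 2, 30.0)")


def try_insert(sql):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same (Inv_code, pro_code) pair again: rejected by the composite primary key.
duplicate_line_ok = try_insert(
    "INSERT INTO detail_invoice VALUES ('INV01', 'P01', 1, 15.0)"
)
# Invoice pointing at an employee that does not exist: rejected by the foreign key.
unknown_employee_ok = try_insert("INSERT INTO Invoice VALUES ('INV02', 'E99', 10.0)")
```

Both attempts are rejected, which is the behavior the "Primary key, foreign key" notes in Table 2.9 describe.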


Figure 2.3: Relational schema installed on SQL Server


CHAPTER 3: DATA WAREHOUSE DESIGN


3.1 Data warehouse diagram: fact table and dimension tables

The warehouse is designed as a star schema. The group chose the star schema because it is
easy to query, and because the application targets small and medium shops, for which a star
schema is simpler to design than a snowflake schema. A snowflake schema is better suited to
large enterprises or to cases where data must be retrieved along many dimensions.

Figure 3.1: Data warehouse star diagram
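
The "easy to query" argument for the star design can be illustrated with a small sketch in Python's built-in sqlite3 module. The table and column names (Fact_Sales, Dim_Employee, Dim_Branch and their keys) and the rows are hypothetical, since the actual star diagram is only available as a figure. The point is that each dimension is reached from the fact table with exactly one join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical dimension tables: one row per employee / branch.
CREATE TABLE Dim_Employee (emp_key INTEGER PRIMARY KEY, emp_name TEXT);
CREATE TABLE Dim_Branch   (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
-- Hypothetical fact table: one row per sale, keyed by the dimensions.
CREATE TABLE Fact_Sales (
    emp_key    INTEGER REFERENCES Dim_Employee(emp_key),
    branch_key INTEGER REFERENCES Dim_Branch(branch_key),
    quantity   INTEGER,
    revenue    REAL
);
INSERT INTO Dim_Employee VALUES (1, 'Lan'), (2, 'Minh');
INSERT INTO Dim_Branch   VALUES (1, 'Hanoi'), (2, 'Da Nang');
INSERT INTO Fact_Sales   VALUES (1, 1, 2, 30.0), (1, 1, 1, 15.0), (2, 2, 4, 60.0);
""")

# Revenue per branch and employee: one join per dimension, no chained lookups.
rows = conn.execute("""
    SELECT b.branch_name, e.emp_name, SUM(f.revenue) AS revenue
    FROM Fact_Sales AS f
    JOIN Dim_Branch   AS b ON b.branch_key = f.branch_key
    JOIN Dim_Employee AS e ON e.emp_key    = f.emp_key
    GROUP BY b.branch_name, e.emp_name
    ORDER BY b.branch_name
""").fetchall()
```

In a snowflake design a dimension such as the branch would itself be split into further tables (for example branch and province), so the same question would need additional joins.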


3.2 Steps to load data

Step 1: Create a Data Flow Task named Load_Dimention

Figure 3.2: Creating a Data Flow Task

Step 2: In Load_Dimention, connect the OLE DB Source and the Excel source to the OLE DB
Destination


Figure 3.3: Creating a data connection with the data warehouse


Step 3: For the operational tables, create a new OLE DB connection that points to the
operational database, select the required table, and use Preview to check the data stored in
the operational database


Figure 3.4: Loading data into the warehouse

Figure 3.5: Loading data into the warehouse


Figure 3.6: Loading data into the warehouse


Figure 3.7: Loading data into the warehouse

Figure 3.8: Loading data into the warehouse

Start loading data into the warehouse. Because an overwrite mechanism is in place, loading
new data does not raise errors.
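
The report does not describe this mechanism in detail. One common way to get the same effect is to overwrite any row whose key already exists, so that re-running a load never fails on a duplicate key. The sketch below shows that idea with Python's built-in sqlite3 module; the Dim_Branch table, its rows, and the use of `INSERT OR REPLACE` are assumptions for illustration, not the group's SSIS configuration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical dimension table keyed by the branch code.
conn.execute(
    "CREATE TABLE Dim_Branch (Branch_code TEXT PRIMARY KEY, Branch_name TEXT)"
)


def load_branches(rows):
    """Insert new branches and overwrite existing ones that share a Branch_code."""
    conn.executemany(
        "INSERT OR REPLACE INTO Dim_Branch (Branch_code, Branch_name) VALUES (?, ?)",
        rows,
    )
    conn.commit()


# First load.
load_branches([("B01", "Hanoi"), ("B02", "Da Nang")])
# Second load repeats B01 with a changed name and adds B03; no key error is raised.
load_branches([("B01", "Hanoi Center"), ("B03", "Hue")])

loaded = conn.execute(
    "SELECT Branch_code, Branch_name FROM Dim_Branch ORDER BY Branch_code"
).fetchall()
```

After the second load the table holds three rows, with B01 carrying the newer name.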


Figure 3.9: Loading data into the warehouse


Figure 3.10: Creating SSAS


Figure 3.11: Cube diagram


Figure 3.12: SSAS on Analysis Server


CHAPTER 4: MDX LANGUAGE - EMPLOYEE DATA ANALYSIS

4.1 MDX query language

Figure 4.1: MDX query 1


Figure 4.2: MDX query 2

Figure 4.3: MDX query 3


Figure 4.4: MDX query 4

Figure 4.5: MDX query 5


4.2 Analyze data on the schema

Figure 4.6: Analysis of employee data by company branch


Figure 4.7: Analysis of employee data by sales productivity for customers

Figure 4.8: Analysis of employee data by products sold by the company



Link to the data analysis website:

https://app.powerbi.com/groups/me/reports/c76eadd5-0651-46f5-bb7a-80e06f75879b/ReportSectionbdbda0e8686329395c60


CHAPTER 5: CONCLUSION
- Because of our limited time and knowledge, the application still has many shortcomings,
but the team fulfilled the important criteria and met the requirements.
- The team designed an operational database to store information about the company's
operations, and designed a data warehouse containing information related to the shop's
business activities.
- The team successfully transferred data from the operational database to the warehouse and
loaded data from multiple sources into the warehouse.
- The team produced revenue statistics as charts using MDX statements.
- We sincerely thank our teacher for helping us throughout the project, for the knowledge
shared in the lectures, and for the many soft skills that are very useful for
near-final-year students like us.


