Data Modeling Comm Pack External

Data Modeling
Communication Pack
12th December 2020
© 2020 Petroliam Nasional Berhad (PETRONAS)

All rights reserved. No part of this document may be reproduced in any form possible, stored in a retrieval system, transmitted and/or disseminated in any form or by any means (digital, mechanical,
hard copy, recording or otherwise) without the permission of the copyright owner.
Internal
Content
1. What is Data Modeling?
2. Data Modeling Work Process
3. Types of Data Model
4. Data Modeler Requirements & Deliverables
What is Data Modeling?
Data modeling is the process of creating a visual representation of either a whole information system or parts of it to communicate connections between data points and structures.
A data model illustrates:
• The types of data used and stored within the system.
• The relationships among these data types.
• The ways the data can be grouped and organized, and its formats and attributes.

A data model is used to provide answers to business questions, hence data model design should prioritize the business stakeholders' perspective.
Business Perspective
To express the business process or event, allowing business stakeholders to easily measure it. The model design should enable the business to see and measure itself from a different perspective.
Technical Perspective
To ensure all data objects required by the database are accurately represented. A data model helps to provide a clear picture of database design at the conceptual, physical and logical levels.

Values of doing Data Modelling
Quality Assurance
Just as architects consider blueprints before constructing a building, you should consider data
before building a system.
Reduce redundancies
Get an overview of relationships and redundancies, resolve discrepancies, and integrate disparate systems so they can work together.
Better performance
A sound model simplifies database tuning. A well-constructed database typically runs fast,
often quicker than expected.
Understanding the Business

The data and relationships represented in a data model provide a foundation on which to build
an understanding of business processes.

Data Modeling Work Process
Data Modeling Work Process: an analogy with Building Construction
A data modeler provides definitions and meanings to a set of data, like an interior designer who brings “life” and function to an empty room.
Additional inputs to a set of data by the data modeler:
• Data normalisation
• Schema selection
• Data model
Data modeling process is designed for clarity and portability to ensure the outcome can be effectively
communicated across all levels of the organization.
1. Understanding Business Requirement
2. Understanding Data
3. Build Data Model: Conceptual, Logical, Physical
The process itself inherently encourages discussion, collaboration, and careful consideration of the transformation
of business requirements into system solutions.
Data Modeler Requirements & Deliverables
A data modeler requires comprehensive inputs in order to produce an accurate data model.
Mandatory
• Business requirement and scope.
• Data source dictionary – describes the content, format and structure of the database.
• Business logic – to be translated into technical logic for physical implementation.
• Data flow diagram – visualization of business data flow.

Note: all of the key information above is documented in the DG Toolkit: Data Template.
Optional (to help in understanding the requirement & data)

• Sample of data
• View access to source system(s)
• Business KPIs (dashboards, reports, and input forms)

In addition to the data model diagram, the data modeler also prepares other documents for governance purposes and to assist the Data Engineer in building the ETL and API.
1. Data model diagram
2. Target data dictionary
3. Source-to-Target mapping
4. Detailed design document
The data modeler produces 3 types of data models that serve different purposes prior to physical data model finalisation.
Conceptual data model
Defines WHAT the system contains. The purpose is to organize, scope and define business concepts and rules. It should be focused on things related to the business and its requirements.
Logical data model
Defines HOW the system should be implemented regardless of the DBMS. The purpose is to define the structure of data elements and to set relationships between them. It adds further information to the conceptual data model elements.
Physical data model
Describes HOW the system will be implemented using a specific DBMS. The purpose is actual implementation of the database. It should be focused on how logical data should be represented and stored in a particular physical database.
Data normalization is a systematic approach of decomposing tables to eliminate data redundancy (repetition) and undesirable characteristics such as insertion, update and deletion anomalies.
It is a multi-step process that puts data into tabular form, removing duplicated data from the relation tables.
Normalization is used mainly for two purposes:
1. Eliminating redundant (useless) data.
2. Ensuring data dependencies make sense, i.e. data is logically stored.
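To make the update anomaly concrete, here is a minimal Python sketch. The rows reuse names from the customer example in this pack. The helper `rooms_for` and the move to room 14 are hypothetical, introduced only for illustration.

```python
# Unnormalised rows: the manager's room is repeated on every customer/contact row.
rows = [
    {"CustID": 171, "AccountManager": "Aminah", "ManagerRoom": 12, "ContactName": "Lisa"},
    {"CustID": 171, "AccountManager": "Aminah", "ManagerRoom": 12, "ContactName": "Ying Meng"},
]


def rooms_for(manager, table):
    """Return the set of rooms recorded for one account manager (hypothetical helper)."""
    return {r["ManagerRoom"] for r in table if r["AccountManager"] == manager}


# Assumed scenario: Aminah moves to room 14, but only the first row gets updated.
rows[0]["ManagerRoom"] = 14
inconsistent = rooms_for("Aminah", rows)  # the table now records two rooms for one manager

# Normalised: the room is stored once, so a single update cannot contradict itself.
account_manager = {"Aminah": {"ManagerRoom": 12}}
account_manager["Aminah"]["ManagerRoom"] = 14
consistent = {account_manager["Aminah"]["ManagerRoom"]}
```

In the unnormalised form, every copy of the room number must be found and changed together. In the normalised form there is only one copy to change.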
Removal of data Redundancy by Normalising - (1/5)
Normalisation Form: 0 NF, 1 NF, 2 NF, 3 NF, BCNF, 4 NF
0 NF
• Not normalised; contains repeating attributes, groups, etc.
1 NF
• It should only have single (atomic) valued attributes.
• All the columns in a table should have unique names.
2 NF
• It should be in 1 NF.
• It should not have Partial Dependency.
3 NF
• It should be in 2 NF.
• It should not have Transitive Dependency.
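Partial and transitive dependencies are both functional dependencies, so a small check over sample rows can help spot them. The sketch below is illustrative only: `determines` is a hypothetical helper, the rows follow the customer example in this pack, and the key is assumed to be the pair (CustID, ContactID). Sample data can refute a dependency but cannot prove one, so the business rules remain the authority.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, each combination of `lhs` values maps to one `rhs` value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for a key and returns it afterwards
        if seen.setdefault(key, val) != val:
            return False
    return True


# 1 NF rows, assumed key: (CustID, ContactID)
rows = [
    {"CustID": 171, "ContactID": 1, "CustName": "Siti", "AccountManager": "Aminah", "ManagerRoom": 12},
    {"CustID": 171, "ContactID": 2, "CustName": "Siti", "AccountManager": "Aminah", "ManagerRoom": 12},
    {"CustID": 190, "ContactID": 3, "CustName": "Jamal", "AccountManager": "Ahmad", "ManagerRoom": 15},
    {"CustID": 190, "ContactID": 4, "CustName": "Jamal", "AccountManager": "Ahmad", "ManagerRoom": 15},
]

# Partial dependency: CustName depends on CustID alone, which is only part of the key.
partial = determines(rows, ["CustID"], ["CustName"])
# Transitive dependency: ManagerRoom depends on AccountManager, a non-key attribute.
transitive = determines(rows, ["AccountManager"], ["ManagerRoom"])
# Not a dependency: one customer has several contacts.
not_fd = determines(rows, ["CustID"], ["ContactID"])
```

Here `partial` and `transitive` come out true, which is what the 2 NF and 3 NF steps remove, while `not_fd` comes out false.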
0 NF (Database is NOT Excel)
CustID | CustName | AccountManager | ManagerRoom | ContactName1 | ContactName2
171 | Siti | Aminah | 12 | Lisa | Ying Meng
190 | Jamal | Ahmad | 15 | Mike | Johan

1 NF (Remove Repeating Groups)
CustID | CustName | AccountManager | ManagerRoom | ContactID
171 | Siti | Aminah | 12 | 1
171 | Siti | Aminah | 12 | 2
190 | Jamal | Ahmad | 15 | 3
190 | Jamal | Ahmad | 15 | 4

ContactID | ContactName
1 | Lisa
2 | Ying Meng
3 | Mike
4 | Johan
2 NF (Remove Redundancy)
CustID | CustName | AccountManager | ManagerRoom
171 | Siti | Aminah | 12
190 | Jamal | Ahmad | 15

CustID | ContactID
171 | 1
171 | 2
190 | 3
190 | 4

ContactID | ContactName
1 | Lisa
2 | Ying Meng
3 | Mike
4 | Johan
Removal of data Redundancy by Normalising (example) - (4/5)
3 NF (Remove Transitive Dependency)
CustID | CustName | AccountManagerID
171 | Siti | 1
190 | Jamal | 2

CustID | ContactID
171 | 1
171 | 2
190 | 3
190 | 4

ContactID | ContactName
1 | Lisa
2 | Ying Meng
3 | Mike
4 | Johan

AccountManagerID | AccountManager | ManagerRoom
1 | Aminah | 12
2 | Ahmad | 15
Removal of data Redundancy by Normalising (example) - (5/5)
Final Data Model
Customer
• CustID – Primary Key
• CustName
• AccountManagerID – Foreign Key
CustomerContact
• CustID – Foreign Key
• ContactID – Foreign Key
Contact
• ContactID – Primary Key
• ContactName
AccountManager
• AccountManagerID – Primary Key
• AccountManager
• ManagerRoom
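One possible physical rendering of this final data model is sketched below using Python's built-in sqlite3 module. Table and column names follow the slide. The data types, the NOT NULL constraints and the composite primary key on CustomerContact are assumptions, not part of the original model. The closing query rebuilds the original wide view from the normalised tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE AccountManager (
    AccountManagerID INTEGER PRIMARY KEY,
    AccountManager   TEXT NOT NULL,
    ManagerRoom      INTEGER
);
CREATE TABLE Customer (
    CustID           INTEGER PRIMARY KEY,
    CustName         TEXT NOT NULL,
    AccountManagerID INTEGER REFERENCES AccountManager(AccountManagerID)
);
CREATE TABLE Contact (
    ContactID   INTEGER PRIMARY KEY,
    ContactName TEXT NOT NULL
);
-- Assumption: the pair of foreign keys also serves as the primary key.
CREATE TABLE CustomerContact (
    CustID    INTEGER REFERENCES Customer(CustID),
    ContactID INTEGER REFERENCES Contact(ContactID),
    PRIMARY KEY (CustID, ContactID)
);
""")

# Sample rows taken from the 3 NF slide.
conn.executemany("INSERT INTO AccountManager VALUES (?, ?, ?)",
                 [(1, "Aminah", 12), (2, "Ahmad", 15)])
conn.executemany("INSERT INTO Customer VALUES (?, ?, ?)",
                 [(171, "Siti", 1), (190, "Jamal", 2)])
conn.executemany("INSERT INTO Contact VALUES (?, ?)",
                 [(1, "Lisa"), (2, "Ying Meng"), (3, "Mike"), (4, "Johan")])
conn.executemany("INSERT INTO CustomerContact VALUES (?, ?)",
                 [(171, 1), (171, 2), (190, 3), (190, 4)])

# Joining the normalised tables reproduces the wide, denormalised view.
wide = conn.execute("""
    SELECT c.CustID, c.CustName, m.AccountManager, m.ManagerRoom, k.ContactName
    FROM Customer c
    JOIN AccountManager m   ON m.AccountManagerID = c.AccountManagerID
    JOIN CustomerContact cc ON cc.CustID = c.CustID
    JOIN Contact k          ON k.ContactID = cc.ContactID
    ORDER BY k.ContactID
""").fetchall()
```

No information is lost by normalising: the join returns one row per customer contact, with the manager details stored only once.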
The success of data modeling is impacted by the quality of the ecosystem.
Business requirement
• Without business context, data modeling is meaningless.
• Data modeling reflects business rules and processes; both the business stakeholder and the technical expert must have the same understanding of business requirements prior to data modeling.
• Data modeling shouldn’t occur in isolation.
Data
• Obtain the high-level concept of the underlying data from a business stakeholder or domain expert before designing a data model for an application.
• Deep dive into data definitions such as data type, data relationship and business rules governing individual columns in a database.
• Data quality and availability may impact the data modeling outcome.
Technology
• Tools or techniques used for data modeling: the technical expert must be familiar with the data modeling techniques, tools and all associated technologies that are currently available, to ensure what’s possible for the data modeling.
Thank you for your passion!
Best Practice
• Ensure that you understand the level of detail (the grain) in each fact table; the dimensions indicate the level of detail.
• Avoid including non-additive facts, which cannot be used in metric computations. For example, instead of storing a ratio or percentage, store the facts that can be used to calculate the percentage or ratio, or instead of storing unit prices, store the extended price (units * unit price).
• Avoid placing attributes in fact tables. Fact tables should contain facts and foreign keys to attributes stored in dimension tables.
• Facts placed in the same fact table must be at the same level of detail (grain) and from the same business process.
• Avoid snowflake structures, which are normalized dimension tables. Dimension tables should not have other dimensions as parents or children. You may consider denormalizing the dimension tables, or separating the dimensions and connecting each one independently to a fact table.
• Do not connect fact tables directly to other fact tables. You should connect them through a common dimension.
• Create common dimensions that can be reused when you create additional fact tables. For example, you should
have only one dimension table for customer, one for product, one for an employee, and so on.
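The advice on non-additive facts can be illustrated with a short Python sketch. The fact rows, the product and the prices below are hypothetical. The point is only that an average price derived from additive facts after aggregation differs from a plain average of stored unit prices.

```python
# Hypothetical fact rows: one row per order line for the same product.
fact_sales = [
    {"product": "A", "units": 1, "unit_price": 100.0},
    {"product": "A", "units": 9, "unit_price": 50.0},
]

# Store the additive fact (extended price) rather than relying on the unit price.
for row in fact_sales:
    row["extended_price"] = row["units"] * row["unit_price"]

# Additive facts can be summed safely across any dimension.
total_units = sum(r["units"] for r in fact_sales)
total_revenue = sum(r["extended_price"] for r in fact_sales)

# Average price derived from the additive facts after aggregation.
avg_price = total_revenue / total_units

# Misleading: averaging stored unit prices ignores how many units were sold.
naive_avg = sum(r["unit_price"] for r in fact_sales) / len(fact_sales)
```

With these rows the derived average price is 55.0, while the plain average of unit prices is 75.0, which overstates the price actually paid per unit.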
Design Guidance Principle – Conceptual Model
• Aim to provide business context for the business understanding of data. Ensure business stakeholders can understand the conceptual model.
• Start with a very high-level data model, showing major entities and primary relationships only. No attributes are shown in the data model.
• Every major entity shown in the data model should have at least one relationship to another entity; relationships appearing in the data model must be clearly displayed.
• Every relationship has a cardinality in both directions.
• Every parent and child relationship has a cardinality of “1” on the parent end, and many “M” on the child end.
• Every supertype and subtype relationship has a cardinality of “1” on the supertype, and “1” on the subtype.
• A non-specific relationship line is typically employed to model a relationship between entities; however, it is allowed to add a more refined level of detail by using identifying or non-identifying relationships.
Design Guidance Principle – Logical Model
• Describe entities and attributes, and the relationships that bind them providing a clear representation of the
business purpose of the data.
• Entities and attributes are normalized to Third Normal Form (3NF). Hence, each entity instance is stored as exactly one unique record. All non-key attributes fully depend on primary key attributes, and no non-key attributes depend on any other non-key attributes.
• Non-specific relationship lines between entities are replaced with identifying or non-identifying relationships.
• An associative entity inherits its primary key from the two entities that have a Many-to-Many relationship.
• Indicate the cardinality of the relationship between two entities. This cardinality notation must convey one of the following meanings:
  • Many-to-Many cardinality (m:n)
  • Many-to-One cardinality (m:1), e.g. 0..m:1
  • One-to-Many cardinality (1:n), e.g. 1:0..m
  • One-to-One cardinality (1:1)
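As a rough aid when profiling source data, the cardinality observed between two entities can be classified from sample key pairs. The function `cardinality` below is a hypothetical helper, not part of any toolkit. Observed data only shows what the sample contains, so the result should be confirmed against the business rules before it goes into the model.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify the A-to-B relationship seen in (a, b) key pairs (hypothetical helper)."""
    b_per_a = defaultdict(set)
    a_per_b = defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    a_has_many_b = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many_a = any(len(a_keys) > 1 for a_keys in a_per_b.values())
    if a_has_many_b and b_has_many_a:
        return "m:n"
    if a_has_many_b:
        return "1:n"
    if b_has_many_a:
        return "m:1"
    return "1:1"


# (CustID, ContactID) pairs from the normalisation example in this pack.
customer_contact = [(171, 1), (171, 2), (190, 3), (190, 4)]
one_to_many = cardinality(customer_contact)
many_to_one = cardinality([(b, a) for a, b in customer_contact])

# Assumed extra pair: contact 1 also serves customer 190, so the sample becomes m:n.
many_to_many = cardinality(customer_contact + [(190, 1)])
```

A many-to-many result is the case that calls for an associative entity such as CustomerContact in the logical model.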
Design Guidance Principle – Physical Model
Physical database design is the process of converting the detailed logical data design into a design that can be
interpreted by the database system.
• Permitted name lengths depend on the database, hence table and column names must be sized to fit the requirements of the target database tool.
• The naming convention used must be meaningful and consistent throughout the design.
• Designate a unique primary key column for every table.
• A column name should contain all of the elements of the logical attribute from which it was derived, but should be abbreviated to fit within the maximum length.
• Implement table and column names in a way that is supported by the database system.
• The physical model comes with a data dictionary, e.g. data type, length, description, keys.
• A certain amount of denormalization is usually necessary when implementing the physical data model. Denormalize only if you can demonstrate a performance gain. Losses in maintaining data integrity must be justified by the performance gain.
• Define alternate keys that will enhance performance by supporting common search paths.
• Define security requirements for every attribute and plan for implementation of security policies.
• Document in detail the actual mechanics of converting the logical data model to the physical data model, including a detailed explanation of the migration between the two. Assumptions made while designing the data model should also be included.
• Consider performance improvement by using database features such as indexing, caching and clustered indexes.
• Referential integrity rules must be defined as constraints on a named foreign key for updates or deletions. These
rules must reflect the business rules for the associated data.
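A named foreign key constraint with explicit update and delete rules might look like the sketch below, written with Python's built-in sqlite3 module. The constraint and index names, and the choice of ON UPDATE CASCADE with ON DELETE RESTRICT, are assumptions for illustration. The actual rules should come from the business rules for the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Customer (
    CustID   INTEGER PRIMARY KEY,
    CustName TEXT NOT NULL
);
CREATE TABLE CustomerContact (
    CustID    INTEGER NOT NULL,
    ContactID INTEGER NOT NULL,
    PRIMARY KEY (CustID, ContactID),
    -- Named constraint: assumed rules, to be replaced by the real business rules.
    CONSTRAINT fk_customercontact_customer
        FOREIGN KEY (CustID) REFERENCES Customer(CustID)
        ON UPDATE CASCADE
        ON DELETE RESTRICT
);
-- Index supporting a common search path (lookups by contact).
CREATE INDEX ix_customercontact_contact ON CustomerContact(ContactID);
INSERT INTO Customer VALUES (171, 'Siti');
INSERT INTO CustomerContact VALUES (171, 1);
""")

# Deleting a customer that still has contacts is rejected by ON DELETE RESTRICT.
try:
    conn.execute("DELETE FROM Customer WHERE CustID = 171")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# Changing the customer key is propagated to the child rows by ON UPDATE CASCADE.
conn.execute("UPDATE Customer SET CustID = 172 WHERE CustID = 171")
child_ids = [row[0] for row in conn.execute("SELECT CustID FROM CustomerContact")]
```

Naming the constraint makes it easier to reference in the data dictionary and in error handling, and the explicit ON UPDATE and ON DELETE clauses record the intended behaviour in the schema itself.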