You are on page 1of 9

Effectivity Satellite & Driving

Key.

One of the most complex and often misunderstood Data


Vault 2.0 artefacts is the beloved Effectivity Satellite.

Effectivity satellites track the effectivity of a relationship


between two business objects among other applications, the
goal is to determine when the link is active according to the
business and provides begin & end dates for this purpose. This
represents the organizational-wide perspective when the
relationship (or business key) is “considered as deleted” in the
organization.
Relationship Effectivity

Although Data Vault 2.0 is INSERT-ONLY why then do we


have a start and end date in an Effectivity Satellite?

To understand this let’s first discuss relationship effectivity.

Hypothetically if two business entities we are tracking have a


relationship between each other, an account and a product (for

instance) can be related in a one-to-one or one-to-many


cardinality it stands to reason that if an account begins a relation

with a product this relationship will have a start date with No end-
date value.

To draw a parallel with slowly changing dimension (type 2) this


means the end-date is assigned a high-date like ‘9999-12-31’
meaning that until the relationship changes, when we query this
relationship it is the effective relationship until the end of time.

Source Table, 1 October 2018

Note: we are using natural keys for simpler representation of the


concepts!
When the relationship changes, the account is now related to
a different product, then we have to finish what we previously
knew about this relationship. In a slowly changing dimension
(Kimball style modelling) we would have updated that record
and inserted the new effective (or active) record.

This also occurs in an operation source and not just in a Data


Warehouse. Imagine this effectivity is being tracked in a system
like Master Data Management.

MDM uses an enterprise-type data model with relationships


between entities tracked with start and end dates; and when
sourced is loaded to Data Vault, the model will look like a bi-
temporal link-satellite.

DV_LOADDATE is done by system and will be replicated to START_DT automatically.


In this example, the content is loaded and hashed for the record
hash and persisted into the link-satellite. Now, if the source
system is doing the effectivity tracking, then this is the best
scenario because Data Vault scales naturally and simply ingests
what the source tells is happening to the account with its
relation to the product. And due to account determines the
change, we can say that Account Entity is driving of how MDM
is tracking the change.

¿How we know what is the entity that determines the


changes? First of all, we need to specify that every case is unique.
But the point in common, is to identify what is the container entity
inside the relationship, or as equal, we need to determine what is
de independent entity; and to do that we need to check the
business rule. For this particular case we do not have an explicit
one, but due all information given before, we can determine that,
between Account and Products relation, is the Product entity the
independent one inside a one to many relationship, (this is
because an Account has multiple products records, and for this
reason, the foreign key ‘Id_Product’ must be inside of Account
Entity) making Product entity as Parent & Account entity as Child.
But, wait a minute, haven´t we said that we need to identify the
container/ independent/ parent entity? Yes, we made this to
identify by default what is the contended/dependent/child
entity because this is the entity that always drives the changes…
and this is the explanation of why Account Entity determines the
changes for this particular case.

Ok let’s back to analyze & see what the effectivity satellite does for
us:

Column Description

DV_HASHKEY_LNK_ACCOUNT_PRODUCT
The parent link table’s surrogate hash key1.

DV_LOADDATE
Load date timestamp of when the record was loaded.

DV_START_DATE
Effective Start date timestamp of the relationship.

DV_END_DATE
Effective end date timestamp of the relationship.

DV_HASHDIFF
Record Hash, this is made of:
DV_START_DATE || DV_END_DATE
1
A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object
in the database. It is conformed by a sequential number outside of the database that is made available to
the user and the application or it acts as an object that is present in the database but is not visible to the
user or application. However, surrogate key is not always the primary key.

Features of the surrogate key:

 It is automatically generated by the system.


 It holds anonymous integer.
 It contains unique value for all records of the table.
 The value can never be modified by the user or application.
 Surrogate key is called the factless key as it is added just for our ease of identification of unique
values and contains no relevant information.
Effectivity Satellite structure, name: sat_ef_mdm_account_product

If the source does not provide a business date tracking the change

we may have to derive this relationship movement ourselves.


Now that we have defined Account as the entity driving &
Account_id is becoming the driving key2 of the relationship,
Could we simply rely on the link structure to show the effectivity
change in the relationship? Well, Let’s see…

Day 1, 2018-10-01

2
A Driving Key is a Unique Key (or a combination thereof) that is used to determine the effectivity of a
relationship or series of relationships. The Driving Key tracks when a relationship changes, and defines
which side of the relationship determines (drives) the change. his is commonly used when a relationship is
tracked by a Business Concept (e.g. expiry information). In these scenarios, implementing the Driving Key
concept can help to close (end-date) relationships which otherwise would remain active.
Remember the effectivity satellite is only about
the relationship tracking and nothing else, on day two we show
some changes, notice how the link-satellite and effectivity satellite
have the same change iterations…

Day 2, 2018-10-10
Now, this is the tricky bit. What happens when the relationship
goes back…?

Day 3, 2018-10-15

Ah now you can see that the effectivity satellite is only about
the relationship and nothing else, the end-date is persisted but in
essence never updated. With no change to the “interest”
column the only way to detect a relationship change is in the
effectivity satellite, you cannot rely on the link table because the
link table is a unique list of relationships and therefore by
returning to the original relationship, the load date of the
previous relationship is Not the active relationship.
Note that the effectivity satellite not only solves

·     Relationships that return to a previous relationship state,

·     Business tracking of relationship effectivity,

·     but also it does not matter what cardinality the relationship is


either.

Other peripheral satellite tables like record tracking and status


tracking can live off a hub and link table, but an effectivity table
cannot.

The last point to make about effectivity satellites is this, identify


the need for a driver key early. As you saw in the illustrations
above, the effectivity satellite captures when a previous
relationship returns. What this means is, there is no other way in
data vault to capture that effectivity by relying on a link table itself,
an effectivity satellite is the only way to solve this source
system gap.

You might also like