
What are Fast Changing Dimensions?

Fast Changing Dimensions are dimensions whose attributes change rapidly. We previously discussed slowly changing dimensions, which can be handled by Type 1, Type 2 or Type 3. A rapidly changing dimension cannot be managed well by any of these types.

For a RAPIDLY CHANGING DIMENSION we need to split the actual dimension table into one or more Mini Dimensions, which contain the fast changing attributes. We then have a primary dimension table plus one or more mini dimensions, and the fact table has two or more foreign keys: one for the primary dimension table and one for each mini-dimension.

Hence we now have two kinds of dimension table. One is the actual dimension minus the fast changing attributes. The other is the mini dimension holding the fast changing attributes.

Implementing Rapidly Changing Dimension

This article describes one methodology for handling rapidly changing dimensions in a data warehouse. In the past we have learnt how to design various slowly changing dimensions. The problem with a Type 2 slowly changing dimension is that every change in the dimensional attributes adds a row to the table. If the attributes of the dimension table change often (that is to say, the dimension is rapidly changing), the table quickly becomes bulky and causes considerable performance issues. Hence the typical SCD Type 2 solution may not be a good fit for rapidly changing scenarios.

There are other methods to handle rapidly changing dimensions, and one of them is discussed in this article. Bear in mind that this is not the only method for rapidly changing scenarios, nor is it the best one for every kind of scenario. A data modeler is encouraged to be innovative and come up with other novel approaches.

Junk Dimension
The method we are going to consider here assumes that not all the attributes of a dimension table are rapidly changing in nature. There might be a few attributes that change quite often and others that seldom change. If we separate the fast changing attributes from the slowly changing ones and move them to another table, while keeping the slowly changing attributes in the original table, we avoid bulking up the dimension table.

Let's take an example to see how it works. Let's say the CUSTOMER dimension has the following columns:

CUSTOMER_KEY
CUSTOMER_NAME
CUSTOMER_GENDER
CUSTOMER_MARITAL_STATUS
CUSTOMER_TIER
CUSTOMER_STATUS

While attributes like name, gender and marital status do not change at all or change rarely, let's assume customer tier and status change every month based on the customer's buying pattern. If we keep status and tier in the same SCD Type 2 Customer dimension table, we risk filling up the table too much too soon. Instead, we can pull those two attributes out into another table, which some people refer to as a JUNK DIMENSION. In this case the junk dimension has 3 columns:

SEGMENTATION_KEY
TIER
STATUS

The column SEGMENTATION_KEY is a surrogate key. It acts as the primary key of the table. Since we have removed status and tier from our main dimension table, the dimension table now looks like this:

CUSTOMER_KEY
CUSTOMER_NAME
CUSTOMER_GENDER
CUSTOMER_MARITAL_STATUS

Next, we must create a link between the customer dimension above and our newly created JUNK dimension. Note that we cannot simply pull the primary key of the JUNK dimension (which we are calling SEGMENTATION_KEY) into the customer dimension as a foreign key. If we did, every change of a customer's tier or status would require a new record in the Customer dimension to refer to the changed key, which would again increase the data volume of the dimension table.

We solve this problem by creating one more mini table between the original customer dimension and the junk dimension. This mini dimension table acts as a bridge between them. We also put "start date" and "end date" columns in this mini table so that we can track the history. Here is how the new mini table looks:

CUSTOMER_KEY
SEGMENTATION_KEY
START_DATE
END_DATE

This table does not require any surrogate key. However, one may include a "CURRENT FLAG" column in the table if required. The whole model now consists of three tables: the customer dimension, the junk dimension, and the mini table that links them.
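The three tables described above can be sketched as a schema. This is a minimal illustration using Python's built-in sqlite3 module; the table names (customer_dim, junk_dim, customer_segment_bridge), the column types and the choice of primary key on the bridge table are assumptions, since the article only lists the column names.

```python
import sqlite3

# In-memory database, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Main dimension: only the slowly changing attributes remain.
CREATE TABLE customer_dim (
    customer_key            INTEGER PRIMARY KEY,
    customer_name           TEXT,
    customer_gender         TEXT,
    customer_marital_status TEXT
);

-- Junk dimension: one row per (tier, status) combination.
CREATE TABLE junk_dim (
    segmentation_key INTEGER PRIMARY KEY,
    tier             TEXT NOT NULL,
    status           TEXT NOT NULL,
    UNIQUE (tier, status)
);

-- Mini (bridge) table: history of which segment a customer was in.
-- No surrogate key of its own; (customer_key, start_date) is an assumed key.
CREATE TABLE customer_segment_bridge (
    customer_key     INTEGER NOT NULL REFERENCES customer_dim (customer_key),
    segmentation_key INTEGER NOT NULL REFERENCES junk_dim (segmentation_key),
    start_date       TEXT NOT NULL,
    end_date         TEXT,
    PRIMARY KEY (customer_key, start_date)
);
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The customer dimension carries no reference to the junk dimension; the only link between the two is the bridge table.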

Maintaining the Junk Dimension

If the number of attributes and the number of possible distinct values per attribute (cardinality) are not very large in the junk dimension, we can pre-populate the junk dimension once and for all. In our earlier example, let's say the possible values of status are only "Active" and "Inactive" and the possible values of Tier are only "Platinum", "Gold" and "Silver". That means there can be only 3 x 2 = 6 distinct combinations of records in this table. We can pre-populate the table with these 6 records, with segmentation key = 1 to 6, and assign one key to each customer based on the customer's status and tier values.
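The pre-population step can be sketched in a few lines of Python. The order in which keys 1 to 6 are assigned to the combinations is an assumption; the article only says that each combination gets one key.

```python
from itertools import product

tiers = ["Platinum", "Gold", "Silver"]
statuses = ["Active", "Inactive"]

# Assign segmentation keys 1..6, one per (tier, status) combination.
# The key order is arbitrary and assumed here.
junk_dim = {
    key: {"tier": tier, "status": status}
    for key, (tier, status) in enumerate(product(tiers, statuses), start=1)
}

# Reverse lookup used when assigning a key to a customer.
segmentation_key_for = {
    (row["tier"], row["status"]): key for key, row in junk_dim.items()
}

print(len(junk_dim))                             # 6 combinations
print(segmentation_key_for[("Gold", "Active")])  # 3 with this ordering
```

Because the table is fully populated up front, a change in a customer's tier or status never adds a row to the junk dimension; it only selects a different existing key.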

How does this Junk dimension help?

Since the connection between the segmentation key and the customer key is maintained in the mini dimension table, frequent changes in tier and status do not change the number of records in the customer dimension table. Whenever a customer's status or tier attribute changes, a new row is added to the mini dimension (with START_DATE = date of the change), signifying the current relation between the customer and the segmentation.
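The update described above can be sketched as follows. The table names, the helper functions change_segment and segment_on, the sample dates and keys, and the convention that END_DATE is exclusive and NULL for the current row are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE customer_segment_bridge (
    customer_key     INTEGER NOT NULL,
    segmentation_key INTEGER NOT NULL,
    start_date       TEXT NOT NULL,
    end_date         TEXT            -- NULL marks the current row (assumed)
);
INSERT INTO customer_dim VALUES (1, 'Example Customer');
INSERT INTO customer_segment_bridge VALUES (1, 5, '2024-01-01', NULL);
""")


def change_segment(conn, customer_key, new_segmentation_key, change_date):
    """Close the open bridge row and open a new one from change_date."""
    with conn:  # commit both statements together
        conn.execute(
            "UPDATE customer_segment_bridge SET end_date = ? "
            "WHERE customer_key = ? AND end_date IS NULL",
            (change_date, customer_key),
        )
        conn.execute(
            "INSERT INTO customer_segment_bridge VALUES (?, ?, ?, NULL)",
            (customer_key, new_segmentation_key, change_date),
        )


def segment_on(conn, customer_key, as_of):
    """Return the segmentation key in effect on the given ISO date."""
    row = conn.execute(
        "SELECT segmentation_key FROM customer_segment_bridge "
        "WHERE customer_key = ? AND start_date <= ? "
        "AND (end_date IS NULL OR end_date > ?)",
        (customer_key, as_of, as_of),
    ).fetchone()
    return row[0] if row else None


# The customer's tier or status changes on 1 February.
change_segment(conn, 1, 3, "2024-02-01")

dim_rows = conn.execute("SELECT COUNT(*) FROM customer_dim").fetchone()[0]
bridge_rows = conn.execute(
    "SELECT COUNT(*) FROM customer_segment_bridge").fetchone()[0]
print(dim_rows, bridge_rows)              # 1 2
print(segment_on(conn, 1, "2024-01-15"))  # 5
print(segment_on(conn, 1, "2024-02-01"))  # 3
```

The customer dimension still has one row after the change; only the narrow bridge table grows, and it keeps enough history to answer point-in-time questions.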

It's also worth mentioning that in this schema we can manage the original customer dimension table with SCD Type 1 or Type 2 methods, but we will have to take extra care to update the mini dimension as well whenever there is a change in the key in the original dimension table.

------------------------------------------------------------

A surrogate key is a substitution for the natural primary key.

It is just a unique identifier or number for each row that can be used as the primary key of the table. The only requirement for a surrogate primary key is that it is unique for each row in the table.

Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the primary keys of the dimension tables. They can use an Informatica sequence generator, an Oracle sequence, or SQL Server Identity values for the surrogate key.

It is useful because the natural primary key (e.g. Customer Number in the Customer table) can change, and this makes updates more difficult.

Some tables have columns such as AIRPORT_NAME or CITY_NAME which are stated as the primary keys (according to the business users). Not only can these change, but indexing on a numerical value is probably better, so you could consider creating a surrogate key called, say, AIRPORT_ID. This would be internal to the system; as far as the client is concerned, you may display only the AIRPORT_NAME.

Another benefit you can get from surrogate keys (SID) is tracking the SCD (Slowly Changing Dimension).

Let me give you a simple, classical example:

On the 1st of January 2002, Employee 'E1' belongs to Business Unit 'BU1' (that's what would be in your Employee Dimension). This employee has turnover allocated to him under Business Unit 'BU1'. But on the 2nd of June, Employee 'E1' is moved from Business Unit 'BU1' to Business Unit 'BU2'. All new turnover has to belong to the new Business Unit 'BU2', but the old turnover should still belong to Business Unit 'BU1'.

If you used the natural business key 'E1' for your employee within your data warehouse, everything would be allocated to Business Unit 'BU2', even what actually belongs to 'BU1'.

If you use surrogate keys, you could create on the 2nd of June a new record for Employee 'E1' in your Employee Dimension with a new surrogate key. This way, in your fact table, you have your old data (before the 2nd of June) with the SID of Employee 'E1' + 'BU1'. All new data (after the 2nd of June) would take the SID of Employee 'E1' + 'BU2'.

You could consider a Slowly Changing Dimension as an enlargement of your natural key: the natural key of the Employee was Employee Code 'E1', but for you it becomes Employee Code + Business Unit, that is 'E1' + 'BU1' or 'E1' + 'BU2'. The difference from simply enlarging the natural key is that you might not have every part of the new key in your fact table, so you might not be able to join on the enlarged key. So you need another id.

A surrogate key is by nature a number, so joins and complex queries take less time, because integer comparison is cheaper than character comparison.

The surrogate key is mainly used in slowly changing dimensions: it maintains the uniqueness of each row in the table and is used to track the old value alongside the new one. It is introduced in place of the natural primary key.
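The Employee 'E1' example above can be sketched in Python. The SID values, the turnover amounts, the sale dates, the helper sid_for, and the assumption that the new record takes effect on the 2nd of June itself are all hypothetical and used only for illustration.

```python
from datetime import date

# Employee dimension: one row per version of the employee,
# each version with its own surrogate key (SID).
employee_dim = [
    {"sid": 1, "employee_code": "E1", "business_unit": "BU1",
     "valid_from": date(2002, 1, 1), "valid_to": date(2002, 6, 2)},
    {"sid": 2, "employee_code": "E1", "business_unit": "BU2",
     "valid_from": date(2002, 6, 2), "valid_to": None},
]


def sid_for(employee_code, on_date):
    """Find the surrogate key of the employee version valid on on_date."""
    for row in employee_dim:
        if (row["employee_code"] == employee_code
                and row["valid_from"] <= on_date
                and (row["valid_to"] is None or on_date < row["valid_to"])):
            return row["sid"]
    raise LookupError(f"no version of {employee_code} on {on_date}")


# Hypothetical source records: (sale date, natural key, turnover).
sales = [(date(2002, 3, 15), "E1", 100), (date(2002, 7, 10), "E1", 250)]

# The fact table stores only the surrogate key, not the natural key.
fact = [{"employee_sid": sid_for(code, day), "turnover": amount}
        for day, code, amount in sales]

# Joining facts to the dimension on the SID attributes turnover correctly.
bu_by_sid = {row["sid"]: row["business_unit"] for row in employee_dim}
turnover_by_bu = {}
for f in fact:
    bu = bu_by_sid[f["employee_sid"]]
    turnover_by_bu[bu] = turnover_by_bu.get(bu, 0) + f["turnover"]

print(turnover_by_bu)  # {'BU1': 100, 'BU2': 250}
```

Had the fact rows carried only the natural key 'E1', both sales would join to whichever Business Unit the employee currently has; the surrogate key is what keeps the March turnover under 'BU1'.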