One of the greatest advantages of buying an OBI Application (Project, Supply Chain or any other of the many Analytics flavours) is the set of predefined ETL mappings, sessions and workflows that come with it. Although there is a good chance that the OLTP data source is highly customised, the online Oracle documentation is full of information that can make the ETL developer's life easier. That said, there are some important ETL tasks whose logic isn't very easy to find. They are like black boxes: you customise them a little bit (some fields behind the X_CUSTOM placeholder here, a small datatype change in the target table there) and they work their magic. So let's reveal the truth behind the veil of a very important piece of out-of-the-box ETL logic in OBI Apps: the Slowly Changing Dimension management mappings. (NOTE: The rest of this article assumes that you already know what a Slowly Changing Dimension (SCD) is and all its different types.)

Starting slow: the general structure
Recognising dimensions that Oracle decided should be SCDs in the OOTB Informatica repository is easy. In the SILOS folder, which contains all Source Independent Load mappings (i.e. loading from the Staging Area to the Reporting tables), you will find a mapping for each of these dimensions named like this: SIL_<entity>Dimension_SCDUpdate.

Figure 1: The Navigator. Aha! The Product dimension is an SCD!

The structure of an SCD extract, transformation and load is essentially managed by three mappings:

SDE_<entity>Dimension
SIL_<entity>Dimension
SIL_<entity>Dimension_SCDUpdate

The first mapping is available in an Adapter folder, related to the data source you are using to feed your precious data warehouse, and looks like Figure 2. It basically joins the main OLTP table with auxiliary ones to extract the data to send to the staging table (which usually takes the name W_<entity>_DS).

Figure 2: A typical SDE mapping. Just don't look into the mapplet…

The second mapping is in SILOS, and manages the insert and update of the rows coming from the staging tables into the final reporting table (usually called W_<entity>_D). Dealing with surrogate keys and lookups is likely to be more complex than the SDE.

Figure 3: The SIL mapping. See the red circles? Hic sunt leones.

The third mapping is in SILOS as well, and its only task is to update the effective dates and a set of column values in the SCD. The effect range (only the most recent records or the whole history) depends on specific variable settings.

Figure 4: The SCDUpdate mapping. Even without, er, update strategy transformation…

Going deep: transformations, variables, lookups…
With regards to managing SCD Type II, the OOTB Oracle approach is pretty standard. Two date fields are used to keep track of history (EFFECTIVE_FROM_DT, EFFECTIVE_TO_DT) and a flag determines the most current version of a specific record (CURRENT_FLG). These fields are set using a series of other auxiliary fields, which are managed throughout the execution of the three mappings.
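To make the role of the two dates and the flag concrete, here is a minimal Python sketch of a Type II history for one product. Only EFFECTIVE_FROM_DT, EFFECTIVE_TO_DT and CURRENT_FLG come from the article; the other column names (ROW_WID, INTEGRATION_ID, COLOUR), the sample values, the far-future end date and the half-open date range are assumptions made for illustration.

```python
from datetime import datetime

# Assumption: an arbitrary far-future date marks the open-ended version.
HIGH_DATE = datetime(3714, 1, 1)

# Two versions of the same (hypothetical) product, keyed by a natural key.
history = [
    {"ROW_WID": 1, "INTEGRATION_ID": "P-100", "COLOUR": "red",
     "EFFECTIVE_FROM_DT": datetime(1899, 1, 1),
     "EFFECTIVE_TO_DT": datetime(2010, 3, 1), "CURRENT_FLG": "N"},
    {"ROW_WID": 2, "INTEGRATION_ID": "P-100", "COLOUR": "blue",
     "EFFECTIVE_FROM_DT": datetime(2010, 3, 1),
     "EFFECTIVE_TO_DT": HIGH_DATE, "CURRENT_FLG": "Y"},
]


def version_as_of(rows, natural_key, when):
    """Return the version valid at `when` (assumes from <= when < to)."""
    for row in rows:
        if (row["INTEGRATION_ID"] == natural_key
                and row["EFFECTIVE_FROM_DT"] <= when < row["EFFECTIVE_TO_DT"]):
            return row
    return None


def current_version(rows, natural_key):
    """Return the version flagged as current, if any."""
    for row in rows:
        if row["INTEGRATION_ID"] == natural_key and row["CURRENT_FLG"] == "Y":
            return row
    return None
```

A fact dated 2005 would join to the "red" version, while anything from March 2010 onwards would pick up the "blue" one.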

So far so good. The SDE mapping simply takes the EFFECTIVE_FROM and EFFECTIVE_TO dates straight from the data source (if available) and keeps track of the last updated date on the main entity, plus all relevant auxiliary tables. This information will then be used to evaluate whether a specific row is eligible for a Type II update or not.

Figure 5: Auxiliary fields in SDE mapping.

The SIL mapping contains the bulk of the logic to update the Type II fields and populate the effective dates.

Figure 6: Auxiliary fields in SIL mapping. Things are getting hotter…

The update strategy is decided based on the following factors:

Is it a new row? (i.e. does the lookup into the target table retrieve anything?)
Have any of the Type II fields changed?
Are any of the last updated dates on the incoming row greater than the ones on the existing row?

Depending on these conditions, the UPDATE_FLG field will provide the update logic as per Figure 7. UPDATE_FLG decides the surrogate primary key population as well: if it is a row to be inserted, it takes the current value of a sequence generator; otherwise it uses the key of the record to update.
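The surrogate key rule can be sketched in a few lines of Python. This is an illustration only, not the Informatica expression itself: the function names are hypothetical, the sequence generator is imitated with itertools.count, and the set of flag values treated as inserts ('I', 'B' and 'S') follows the update strategy described in this article.

```python
from itertools import count

# Flag values the article treats as inserts; everything else updates.
INSERT_FLAGS = {"I", "B", "S"}


def make_key_assigner(start=1):
    """Build a key assigner backed by a simple in-memory sequence.

    Hypothetical helper: the real mapping uses a sequence generator
    transformation, imitated here with itertools.count.
    """
    sequence = count(start)

    def assign(update_flg, existing_key=None):
        # Rows to be inserted take the next sequence value...
        if update_flg in INSERT_FLAGS:
            return next(sequence)
        # ...all other rows keep the key of the record being updated.
        return existing_key

    return assign
```

For instance, with `assign = make_key_assigner(100)`, a new row gets key 100, a simple update on target row 7 keeps key 7, and the next Type II change gets 101.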

Figure 7: All possible values of the Update Flag. This time, X does NOT mark the spot.

The final value of EFFECTIVE_FROM_DT is chosen among the auxiliary fields depending on the value of the Update Flag as follows:

If the record is to be inserted (UPDATE_FLG = 'I' or 'B'), it returns the EFFECTIVE_FROM date from the data source. If it is null, then defaults it to 01/01/1899.
If the record is a Type II change ('S'), it returns the greatest last updated date among all source tables involved. If it is null, then returns the session start time.
If the record is a simple update ('U'), it returns the EFFECTIVE_FROM date of the target row to be updated.
In all remaining cases (Update Flag is null or 'D'), returns NULL.

EFFECTIVE_TO_DT is calculated as follows:

If the record is a new row not marked for soft deletion (UPDATE_FLG = 'I' or 'S'), then returns the HI_DATE_VAR (either the non-defaulted EFFECTIVE_TO date from the data source, or 01/01/3714), unless the rare case of the EFFECTIVE_FROM date in input is greater than the HI_DATE_VAR itself, in which case it returns the original EFFECTIVE_TO date from the data source.
If the record is a new row marked for soft deletion ('B'), returns either the non-defaulted EFFECTIVE_TO date from the data source or the first not null between the LKP_DELETED_ON_DT (which anyway is in almost all cases NULL) and the session start time.
If the record is a simple update, then returns the HI_DATE_VAR.

Finally, the SIL mapping decides the update strategy of the target reporting table as follows: if the record is a new one (UPDATE_FLG either 'I', 'B' or 'S') then Insert, else (UPDATE_FLG = 'U' or 'D') Update.

Clear as mud? In the likely case that the aforementioned pieces of logic are a bit cryptic, my suggestion is to have a thorough look at the variables/fields and write down an example.
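One way of writing down such an example is the Python sketch below, which restates the SIL date rules and the insert/update decision as plain functions. It is a reading of the description in this article, not the actual Informatica expressions: the function and argument names are hypothetical, the low and high dates (01/01/1899 and 01/01/3714) are assumed defaults, and the behaviour of EFFECTIVE_TO_DT for flag values not covered by the text is a guess.

```python
from datetime import datetime

# Assumed default low and high dates.
LOW_DATE = datetime(1899, 1, 1)
HIGH_DATE = datetime(3714, 1, 1)


def effective_from(flag, src_from, last_updated, target_from, session_start):
    """EFFECTIVE_FROM_DT as a function of the update flag."""
    if flag in ("I", "B"):   # row to be inserted
        return src_from if src_from is not None else LOW_DATE
    if flag == "S":          # Type II change
        return last_updated if last_updated is not None else session_start
    if flag == "U":          # simple update keeps the target row's date
        return target_from
    return None              # flag is null or 'D'


def effective_to(flag, src_from, src_to, deleted_on, session_start):
    """EFFECTIVE_TO_DT as a function of the update flag."""
    # High date variable: source "to" date if provided, else the default.
    hi_date_var = src_to if src_to is not None else HIGH_DATE
    if flag in ("I", "S"):   # new row, not soft deleted
        if src_from is not None and src_from > hi_date_var:
            return src_to    # rare case: "from" date beyond the high date
        return hi_date_var
    if flag == "B":          # new row marked for soft deletion
        if src_to is not None:
            return src_to
        return deleted_on if deleted_on is not None else session_start
    if flag == "U":          # simple update
        return hi_date_var
    return None              # assumption: not covered by the description


def load_action(flag):
    """Insert for 'I', 'B' and 'S'; update otherwise ('U' or 'D')."""
    return "insert" if flag in ("I", "B", "S") else "update"
```

Walking one brand-new row with no source dates through these functions gives an EFFECTIVE_FROM_DT of 01/01/1899, an EFFECTIVE_TO_DT of 01/01/3714 and an insert, which matches the rules above.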

The SCDUpdate mapping, as stated before, takes care of updating the historical records already present in the target table. The SQL Qualifier transformation contains a SQL override structured as follows:

Figure 8: The SQL override in the SCDUpdate mapping. For all those who always dreamt of becoming a DBA.

The DELTA_TABLE sub-query does a first cut on the rows to retrieve, selecting only the target records that have been processed in the preceding load. The outer sub-query, SCD_HISTORY, retrieves only records that have been historically tracked at least once. Finally, the whole query checks the $$UDATE_ALL_HISTORY (sic!) parameter and defines if it has to consider all historical records or only the most current ones. In other words, if the parameter is set to 'Y', then each Type 1 change will be propagated across all records related to a specific natural key, otherwise only the records flagged as current. Note how the recordset is ordered by EFFECTIVE_FROM date in decreasing order, which means that the most current records will be first for each natural key.
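The row selection performed by the override can be imitated in Python as a rough sketch. This is not the SQL itself: the function name and the INTEGRATION_ID and ROW_WID columns are assumptions, and "historically tracked at least once" is interpreted here as "the natural key has more than one version", which is only one possible reading.

```python
from collections import Counter
from datetime import datetime


def scd_update_candidates(target_rows, delta_keys, update_all_history="N"):
    """Select the target rows the SCDUpdate step would touch (sketch)."""
    # DELTA_TABLE: only natural keys processed in the preceding load.
    delta = [r for r in target_rows if r["INTEGRATION_ID"] in delta_keys]
    # SCD_HISTORY (assumed meaning): keys with more than one version.
    versions = Counter(r["INTEGRATION_ID"] for r in delta)
    tracked = [r for r in delta if versions[r["INTEGRATION_ID"]] > 1]
    # Parameter check: whole history for 'Y', else current rows only.
    if update_all_history != "Y":
        tracked = [r for r in tracked if r["CURRENT_FLG"] == "Y"]
    # Most recent version first within each natural key (stable sorts).
    tracked.sort(key=lambda r: r["EFFECTIVE_FROM_DT"], reverse=True)
    tracked.sort(key=lambda r: r["INTEGRATION_ID"])
    return tracked


# Hypothetical target rows: P-1 has history, P-2 does not, P-3 was not loaded.
rows = [
    {"ROW_WID": 1, "INTEGRATION_ID": "P-1", "CURRENT_FLG": "N",
     "EFFECTIVE_FROM_DT": datetime(2009, 1, 1)},
    {"ROW_WID": 2, "INTEGRATION_ID": "P-1", "CURRENT_FLG": "Y",
     "EFFECTIVE_FROM_DT": datetime(2010, 6, 1)},
    {"ROW_WID": 3, "INTEGRATION_ID": "P-2", "CURRENT_FLG": "Y",
     "EFFECTIVE_FROM_DT": datetime(2009, 1, 1)},
    {"ROW_WID": 4, "INTEGRATION_ID": "P-3", "CURRENT_FLG": "N",
     "EFFECTIVE_FROM_DT": datetime(2008, 1, 1)},
    {"ROW_WID": 5, "INTEGRATION_ID": "P-3", "CURRENT_FLG": "Y",
     "EFFECTIVE_FROM_DT": datetime(2010, 1, 1)},
]
```

With `delta_keys = {"P-1", "P-2"}`, setting the parameter to 'Y' selects both versions of P-1, newest first, while any other value selects only the current one.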

Figure 9: SCDUpdate fields. All this work just for a date!

So, that's basically it. You will notice a few little differences depending on the entity (all examples here have been derived from the Product SCD Dimension in SILOS), but the general strategy remains the same. Complex? Sure is. Improvable? Possibly. Will this change with the arrival of Oracle 11g? Only time will tell…
