
During ETL, the preconfigured Informatica workflows/mappings identified as the SDE (Source Dependent Extract) mappings extract data from source OLTP database tables by querying views created by DAC, whose names start with V_, and load the data into staging tables on the SRMW database. The view definition changes depending on whether the ETL run is a full load or an incremental load. For a full load, the view is a SELECT * FROM <base table>. For an incremental load, the view definition joins the base table with the image tables. This is done to minimize the impact and duration of the ETL process on the OLTP database. DAC drops and creates these views during each run (unless the System property called Drop Create Views Always specifies otherwise). These views can be dropped at any time, and DAC will create them when necessary.

R_IMG
The S_ETL_R_IMG_* tables help identify rows in the OLTP tables that have been modified or newly created (inserted) during a period of time. For this purpose, the last update column of the source table records is compared with the S_ETL_R_IMG_* tables to extract the records that fall within the ETL extract window specified in the DAC ETL preferences.

I_IMG
The S_ETL_I_IMG_* tables are temporary holders that store only the row identifiers of rows that have changed or been newly created. They are used for incremental ETL.

D_IMG
To capture deletes in the OLTP database, the change capture process uses delete triggers in conjunction with the S_ETL_D_IMG_* tables. These triggers are created on the OLTP tables through DAC. When rows are deleted from an OLTP table, the trigger captures the row identifier into the corresponding S_ETL_D_IMG table. During the Change Capture process, DAC moves the deleted row information from the D image tables to the I image tables with the OPERATION column set to a value of D.
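The delete-capture path described under D_IMG can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of the OLTP database, a cut-down S_CONTACT table, made-up row identifiers, and a hypothetical trigger name (S_CONTACT_DEL_DEMO). The real triggers are generated by DAC, and which values they write to LAST_UPD and MODIFICATION_NUM is an assumption here.

```python
import sqlite3

# Illustrative stand-ins only: SQLite instead of the OLTP database,
# a cut-down S_CONTACT, and a hypothetical trigger name.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S_CONTACT (
        ROW_ID TEXT PRIMARY KEY, MODIFICATION_NUM INTEGER, LAST_UPD TEXT);
    CREATE TABLE S_ETL_D_IMG_12 (
        ROW_ID TEXT, MODIFICATION_NUM INTEGER, LAST_UPD TEXT);
    CREATE TABLE S_ETL_I_IMG_12 (
        ROW_ID TEXT, MODIFICATION_NUM INTEGER, LAST_UPD TEXT, OPERATION TEXT);

    -- Delete trigger: record the identifier of every deleted row.
    -- Copying the OLD values of the other two columns is an assumption.
    CREATE TRIGGER S_CONTACT_DEL_DEMO AFTER DELETE ON S_CONTACT
    BEGIN
        INSERT INTO S_ETL_D_IMG_12 (ROW_ID, MODIFICATION_NUM, LAST_UPD)
        VALUES (OLD.ROW_ID, OLD.MODIFICATION_NUM, OLD.LAST_UPD);
    END;

    INSERT INTO S_CONTACT VALUES ('1-AAA', 1, '2007-06-01 09:00:00');
    INSERT INTO S_CONTACT VALUES ('1-BBB', 3, '2007-06-12 14:30:00');
""")

# A user deletes a contact in the OLTP; the trigger fires.
conn.execute("DELETE FROM S_CONTACT WHERE ROW_ID = '1-AAA'")

# Change capture step: move deleted rows from the D image table
# to the I image table, flagged with OPERATION = 'D'.
conn.execute("""
    INSERT INTO S_ETL_I_IMG_12 (ROW_ID, MODIFICATION_NUM, LAST_UPD, OPERATION)
    SELECT ROW_ID, MODIFICATION_NUM, LAST_UPD, 'D' FROM S_ETL_D_IMG_12
""")

deleted_rows = conn.execute(
    "SELECT ROW_ID, OPERATION FROM S_ETL_I_IMG_12").fetchall()
remaining = [r[0] for r in conn.execute("SELECT ROW_ID FROM S_CONTACT")]
print(deleted_rows, remaining)
```

The point of the sketch is that the deleted row no longer exists in the base table, so the only way the warehouse can learn about it is through the identifier the trigger left behind.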

Initial Tasks Performed on the Transaction Database (OLTP)

SIF File and Image Tables

o As part of the initial installation steps for Analytics 7.8.x, a SIF file is applied to the Siebel Transaction Database (OLTP). This creates many S_ETL_* tables, including three types of image tables that are used for the change data capture process:

  S_ETL_D_IMG_* tables: These are delete tables that are used to capture data for rows that have been deleted in the OLTP. Rows are inserted into these D image tables via database triggers.

  S_ETL_I_IMG_* tables: These tables are used for all incremental changes (inserts/updates/deletes). Data is loaded into these tables from their corresponding OLTP base tables and D image tables via the DAC at the beginning of a load.

  S_ETL_R_IMG_* tables: These are reference tables that reflect the data that has been loaded to the SRMW. For performance reasons, only rows with last_upd within the prune period are retained in this table. (Prune days is explained later in this document.) Data is loaded into the R tables via the DAC at the end of a load.

The D and R image tables have the following structure:

Name                 Null?     Type
-------------------- --------- -----------------
ROW_ID               NOT NULL  VARCHAR2(15 CHAR)
LAST_UPD             NOT NULL  DATE
MODIFICATION_NUM     NOT NULL  NUMBER(10)

The I image table has the following structure:

Name                 Null?     Type
-------------------- --------- -----------------
ROW_ID               NOT NULL  VARCHAR2(15 CHAR)
LAST_UPD             NOT NULL  DATE
MODIFICATION_NUM     NOT NULL  NUMBER(10)
OPERATION            NOT NULL  VARCHAR2(1 CHAR)

SIF File and Delete Triggers

o When the SIF file is applied, delete triggers are created in the OLTP. They are applied only to certain tables (for example, S_CONTACT and S_ORG_EXT). When a record is deleted from one of these tables, a row is inserted into the corresponding S_ETL_D_IMG_* table.

Full Load

Initially, a full load is performed to extract all required data and load all tables in the Siebel Relationship Management Warehouse (SRMW).
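As a rough illustration of why a full load extracts every source row while an incremental load does not, the sketch below recreates a V_CONTACT view in both forms described in the overview: a plain SELECT * over the base table, and a join of the base table to its I image table. It uses Python's built-in sqlite3 module as a stand-in for the OLTP database. The helper names recreate_view and extract, the cut-down table columns, and the sample rows are hypothetical and exist only for this example.

```python
import sqlite3

# Illustrative stand-ins only: SQLite instead of the OLTP database,
# cut-down tables, and made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S_CONTACT (ROW_ID TEXT PRIMARY KEY, LAST_NAME TEXT);
    CREATE TABLE S_ETL_I_IMG_12 (ROW_ID TEXT);
    INSERT INTO S_CONTACT VALUES ('1-AAA', 'Adams');
    INSERT INTO S_CONTACT VALUES ('1-BBB', 'Baker');
    INSERT INTO S_CONTACT VALUES ('1-CCC', 'Clark');
    -- Only one contact is marked as changed in the I image table.
    INSERT INTO S_ETL_I_IMG_12 VALUES ('1-BBB');
""")


def recreate_view(conn, incremental):
    """Drop and recreate V_CONTACT in its full or incremental form."""
    conn.execute("DROP VIEW IF EXISTS V_CONTACT")
    if incremental:
        # Incremental form: only rows whose ROW_ID is in the I image table.
        conn.execute("""
            CREATE VIEW V_CONTACT AS
            SELECT S_CONTACT.* FROM S_CONTACT, S_ETL_I_IMG_12
            WHERE S_CONTACT.ROW_ID = S_ETL_I_IMG_12.ROW_ID
        """)
    else:
        # Full-load form: the view is identical to the base table.
        conn.execute("CREATE VIEW V_CONTACT AS SELECT * FROM S_CONTACT")


def extract(conn):
    """Return the ROW_IDs an extract mapping would see through the view."""
    rows = conn.execute("SELECT ROW_ID FROM V_CONTACT ORDER BY ROW_ID")
    return [r[0] for r in rows]


recreate_view(conn, incremental=False)
full_rows = extract(conn)

recreate_view(conn, incremental=True)
incremental_rows = extract(conn)

print(full_rows, incremental_rows)
```

Because the extract mappings always query the view, switching the view definition changes how many rows are read without changing the mappings themselves.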
The pre-load change data capture steps (DAC task Change Capture For Siebel OLTP) for a full load are described below. For all examples, please assume that the current run date is 2007-06-17.

o Image tables (D, I and R) are truncated (for example, the S_ETL_*_IMG_12 tables are the image tables for S_CONTACT).

o New records are inserted into the R table, for example S_ETL_R_IMG_12:

    INSERT /*+APPEND*/ INTO S_ETL_R_IMG_12 (ROW_ID, MODIFICATION_NUM, LAST_UPD)
    SELECT ROW_ID, MODIFICATION_NUM, LAST_UPD
    FROM S_CONTACT
    WHERE LAST_UPD > TO_DATE('2007-05-18 01:00:26', 'YYYY-MM-DD HH:MI:SS')
    /* This is current_run date MINUS Prune Days (for example, 30 days) */
    /* Prune days will be discussed later in this document */

  Oddly, this step runs prior to the extraction or load of any data instead of subsequent to it. It is premature to have these rows inserted into the R table prior to the end of the load, but this is the way Siebel engineered it.

o Views are dropped and recreated as:

    CREATE VIEW V_CONTACT AS SELECT * FROM S_CONTACT

  NOTE - During a full load this view is intentionally the same as the base S_% table so that all the rows in the base S_ table are extracted by the ETLs. For an incremental load, this view has different SQL behind it (explained later in this document).

At the end of the load, when the post-load change capture step is executed (DAC task Change Capture Sync For Siebel OLTP), the views are dropped and recreated using SQL that joins the base table to the I image table:

    CREATE VIEW V_CONTACT AS
    SELECT * FROM S_CONTACT, S_ETL_I_IMG_12
    WHERE S_CONTACT.ROW_ID = S_ETL_I_IMG_12.ROW_ID

This is done in preparation for future incremental loads. During an incremental load, the image tables are leveraged in order to limit the number of rows extracted.

Incremental Load

Once a full load has run successfully, subsequent loads to the SRMW are incremental loads, meaning that only data that has changed in the source since the last run is loaded to the SRMW. During an incremental load, the ETL process extracts this changed data by using the views on the OLTP that join the base S_ tables with their corresponding I image tables.

Prune Days refers to how far back in time the customer wants to go in order to extract the changed data. The setting ensures that OLTP rows that have a last_upd date older than the start date/time of the prior load (also known as last_refresh_date) are not missed. It is determined by the customer and set up in the DAC client.

For the examples that are illustrated in this document, please assume the following:

    prune_days = 30
    last_refresh_date (prior load) = 2007-06-10
    current_load = 2007-06-17

At the beginning of an incremental load, the DAC executes a group of pre-load change data capture steps (DAC task Change Capture For Siebel OLTP). The steps are described below:

o The I image tables are truncated (for example, S_ETL_I_IMG_12 is the I image table for S_CONTACT).

    TRUNCATE TABLE S_ETL_I_IMG_12

o New rows are inserted into the I image table for rows that have a Last_Upd more recent than last_refresh_date MINUS prune_days. Before these rows are inserted, they are compared to the data in the R image table, and if the modification_num and the last_upd values are the same for a particular row_id, the row is excluded from the insert. This prevents needlessly processing rows that haven't changed since the last time they were loaded into the SRMW.

    INSERT /*