You are on page 1of 3

Prune days and Change Capture in DAC

Here is a question I often get from customers regarding change capture by DAC when it is used for
extracting Siebel data. The problem is equally valid for non-Siebel data sources if the default change
capture mechanism provided in DAC is used for identifying changes in source data.

How can we ensure that data committed with an update timestamp between the start of an ETL run
and the end of the extract is not excluded by the next ETL?

 DAC stores a value called LAST_REFRESH_DATE at the end of an ETL extract.

 During the next run, DAC captures changes to the source data by comparing this
LAST_REFRESH_DATE with the LAST_UPD date of records in the source.

 So if the LAST_REFRESH_DATE is less than LAST_UPD of a source record, then the record
being processed has either been updated or created between the time of last execution and the
current run

 But what will happen if a record in the source is committed between the time of starting the data
extract and completing the extract with a commit date that is less than the

Prune days to the rescue!!!

The solution to the above problem is to set the PRUNE_DAYS parameter.

If the prune days parameter is set to 1 for example, the change capture process will use
LAST_REFRESH_DATE – Prune days as the cutoff timestamp for change capture

Here is an example of such a scenario

a) Let us assume that there is a table called S_CUSTOMER in transaction database with 2 records
created during business hours on the 1st of January.

b) On the January 1st, Extraction of Customer Data begins at 1700 hours and completes at January
1st, 1800 hours into a customer Dimension called WC_CUSTOMER.

c) LAST_REFRESH_DATE stored is January 1, 1800 hours

d) Now, a new customer is created between 1700 and 1800. But it is committed to the database
AFTER the extract is completed with a timestamp of January 1st, 17.30 am

1800 hours i.e. all the records with LAST_UPD > Dec 31. it does not qualify for the incremental extract and we have a missing record in the target table g) But if we had set the PRUNE DAYS parameter to 1.e. the cutoff date used to identify changes in source data would be . 1730 i. 1800 hours into the target table Unfortunately for the 3rd record depicted below. the LAST_UPD is Jan 1. Jan 2.e) More customers are added to the transaction database during the next working day. 1800 hours will qualify for the incremental extract and we will have a complete set of data in the target table . As a result the following is the condition of the source table at the time of the next ETL Extract i. Jan 1.e. 12:00 am f) Now the change capture process in DAC will compare the stored LAST_REFRESH_DATE with the update timestamp of the records in the source data and pull only the records updated after this LAST_REFRESH_DATE i. LAST_REFRESH_DATE – 1 day If this ploy is used. it is before the LAST_REFRESH_DATE.e.Dec 31. So.

.9 for non-Siebel Sources For source systems other than Siebel.The Prune days parameter can be used to ensure that DAC Change Capture does not miss out data created around the time of data extraction. New Feature in DAC 7. Set it with care to a reasonable value to avoid increasing the volume of data selected by incremental load process. the Prune Days setting is used in the same way except that the DAC subtracts the number of prune days from the LAST_REFRESH_DATE of a given source and supplies this as the value for the $$LAST_EXTRACT_DATE parameter.