Tasks you need to do to perform data warehouse performance testing

There are essentially three steps to carry out performance testing on the ETL processes: configure the test environment, run the bulk load processes, and run the incremental processes. Let's go through these three steps one by one:

1. First you need to configure the data warehouse test servers as per the production configuration. You create the data stores and set up the ETL processes; that is, you migrate the SSIS packages from the development server to the test server. Then you need to connect the test server to the test version of the source system (sometimes known as QA) and set up all the necessary connectivity and access. You load the test data into the QA environment. Finally, you run the whole ETL process end to end using a small load to verify that the test environment has been configured properly.

2. You run the initial load ETL processes (be they T-SQL scripts or SSIS packages) to populate the warehouse and measure their performance; that is, when each task in the initial/bulk load ETL processes started and when each one completed. One way to measure the performance is by utilizing the ETL log and the timestamp column of the loaded table. Let me explain what I mean by the ETL log and the timestamp column. The ETL log is a table or file that the ETL system writes to; it is sometimes known as the audit table or file. In each ETL task or process, after each major step, the ETL routine writes a row (if it is a table) or a line (if it is a file) to this ETL log, containing the essential statistics of the task, such as the data volume (the number of records affected by the operation), the task name, and the current time. The timestamp column contains the time the row was loaded or modified. Both the ETL log and the timestamp columns are useful for measuring the performance of the incremental ETL and the bulk load because you can determine the duration of each ETL task. They are also useful for troubleshooting the root cause or finding the bottleneck of a performance problem, as well as for fine-tuning data warehouse performance: you find the long-running tasks and then try to make those tasks more efficient. Both take a little time to implement (probably 30 minutes to an hour for each task) because you need to program each ETL task to write into the log, but they are extremely useful because they help identify slow-running tasks (I have seen a 10 percent increase in performance), so they are worth the effort. Another way to measure the performance is to use SQL Server Profiler: you set up SQL Server Profiler to log the time each SQL statement is executed. You can also utilize SSIS logging; SSIS logs runtime events as they are executed in SSIS. For this you need to enable logging on each task in all the SSIS packages in the bulk load processes.
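To make the ETL log concrete, here is a minimal sketch in T-SQL. The names in it (etl_log, task_name, rows_affected, load_fact_sales, and so on) are illustrative assumptions, not a standard; adapt them to your own conventions.

    -- Illustrative ETL log (audit) table
    CREATE TABLE dbo.etl_log
    (
        log_id        INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
        task_name     VARCHAR(100)      NOT NULL,  -- which ETL task wrote the row
        step_name     VARCHAR(100)      NOT NULL,  -- the major step just completed
        rows_affected INT               NOT NULL,  -- data volume of the operation
        logged_at     DATETIME          NOT NULL DEFAULT GETDATE()  -- current time
    );

    -- Immediately after each major step, the task writes one log row.
    -- @@ROWCOUNT still holds the row count of the preceding load statement.
    INSERT INTO dbo.etl_log (task_name, step_name, rows_affected)
    VALUES ('load_fact_sales', 'insert new rows', @@ROWCOUNT);

    -- Duration of each task (first to last logged step), longest first
    SELECT   task_name,
             DATEDIFF(SECOND, MIN(logged_at), MAX(logged_at)) AS duration_sec,
             SUM(rows_affected) AS total_rows
    FROM     dbo.etl_log
    GROUP BY task_name
    ORDER BY duration_sec DESC;

The last query is what makes the log useful for finding bottlenecks: the long-running tasks float to the top, and those are the ones you then try to make more efficient.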

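Similarly, if you point SSIS logging at the SQL Server log provider, SSIS writes its runtime events to a log table in the database you designate (dbo.sysssislog in SQL Server 2008 and later, dbo.sysdtslog90 in SQL Server 2005). The following sketch pairs the OnPreExecute and OnPostExecute events to get the duration of each task; verify the table name on your version before using it.

    -- Task durations from the SSIS log (table name varies by SQL Server version)
    SELECT pre.source                                    AS task_name,
           pre.starttime                                 AS task_started,
           post.endtime                                  AS task_completed,
           DATEDIFF(SECOND, pre.starttime, post.endtime) AS duration_sec
    FROM   dbo.sysssislog AS pre
    JOIN   dbo.sysssislog AS post
           ON  post.executionid = pre.executionid        -- same package execution
           AND post.sourceid    = pre.sourceid           -- same task
    WHERE  pre.event  = 'OnPreExecute'
      AND  post.event = 'OnPostExecute'
    ORDER BY duration_sec DESC;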
3. Then you run the ETL batches, whether it is daily, hourly, or weekly, in a particular sequence according to the prepared test scenario and measure their performance (speed). In incremental ETL, you can use the same methods/instruments as in the bulk load to measure the performance: the ETL log, the timestamp column, SQL Server Profiler, and SSIS logging. For real-time ETL, you need to measure the time lag as well; that is, how many minutes after the transaction was made in the source system did the data arrive in the data warehouse? (See the sketch after the list below.)

The previous steps are about performance testing for ETL. For applications, whether they're reports, mining models, cubes, or BI applications, you use the applications themselves in application performance testing, exercising both normal scenarios and exception scenarios. Normal scenarios are steps that the users perform in regular daily activities. Exception scenarios are steps the users perform that result in errors. To measure performance, you can script the actions, or you can measure them manually. Scripting the actions means you use software to record the actions that a user performs when using the applications. You then play the recording back to simulate many people using the system simultaneously while recording the system performance. Measuring them manually means you use the system interactively.

Performance should be separately tested for:
• Database: requires the definition of a reference workload in terms of number of concurrent users, types of queries, and data volume
• Front-end: requires the definition of a reference workload in terms of number of concurrent users, types of queries, and data volume
• ETL procedures: requires the definition of a reference data volume for the operational data to be loaded
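For the real-time time lag in step 3, one simple approach is to compare the transaction timestamp carried over from the source system with the warehouse timestamp column. This sketch assumes a fact table dbo.fact_sales with hypothetical columns source_created_at (when the transaction happened in the source) and loaded_at (when the row arrived in the warehouse); substitute the timestamp columns your tables actually have.

    -- Time lag between the source transaction and its arrival in the warehouse
    SELECT AVG(DATEDIFF(MINUTE, source_created_at, loaded_at)) AS avg_lag_min,
           MAX(DATEDIFF(MINUTE, source_created_at, loaded_at)) AS worst_lag_min
    FROM   dbo.fact_sales
    WHERE  loaded_at >= DATEADD(DAY, -1, GETDATE());  -- rows loaded in the last 24 hours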

Stress test: simulates an extraordinary workload due to a significantly larger amount of data and queries. Variables that should be considered to stress the system are:
• Database: number of concurrent users, types of queries, and data volume
• Front-end: number of concurrent users, types of queries, and data volume
• ETL procedures: data volume

The expected quality level can be expressed in terms of response time.
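Because the expected quality level is expressed in terms of response time, you need a repeatable way to record it. Here is a minimal T-SQL harness, under the assumption that dbo.fact_sales and order_date stand in for one query of your reference workload; to approximate the concurrent users of the stress test, run the same script from many sessions at once with a load-generation tool.

    -- Time one representative query from the reference workload
    DECLARE @start DATETIME;
    SET @start = GETDATE();

    SELECT COUNT(*) AS row_cnt                      -- placeholder reference query
    FROM   dbo.fact_sales
    WHERE  order_date >= DATEADD(MONTH, -1, GETDATE());

    SELECT DATEDIFF(MILLISECOND, @start, GETDATE()) AS response_time_ms;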