
ETL Testing

1. Challenges of Data warehouse Testing


--Data selection from multiple source systems, and the analysis that follows, poses a great challenge.
--Volume and complexity of the data.
--Inconsistent and redundant data in the data warehouse.
--Inconsistent and inaccurate reports.
--Non-availability of history data.

2. Testing Methodology
--Use of traceability to enable full test coverage of business requirements
--In-depth review of test cases
--Manipulation of test data to ensure full test coverage
--Provision of appropriate tools to speed the process of test execution & evaluation
--Regression testing

Fig 1. Testing Methodology (V-Model)
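The traceability point above can be illustrated with a small coverage check. This is only a sketch: the requirement IDs, test case IDs, and the `uncovered_requirements` helper are hypothetical and not part of any specific tool mentioned in this document.

```python
from typing import Dict, List, Set


def uncovered_requirements(
    requirements: Set[str], test_cases: Dict[str, List[str]]
) -> Set[str]:
    """Return the requirements that no test case traces back to."""
    covered = {req for reqs in test_cases.values() for req in reqs}
    return requirements - covered


# Hypothetical traceability data: test case ID -> requirement IDs it covers.
requirements = {"BR-1", "BR-2", "BR-3"}
test_cases = {"TC-01": ["BR-1"], "TC-02": ["BR-1", "BR-3"]}

missing = uncovered_requirements(requirements, test_cases)
print(sorted(missing))  # ['BR-2']
```

An empty result would suggest that every business requirement is reached by at least one test case; it says nothing about how thoroughly each one is tested.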

3. Testing Types
The following types of testing are performed for data warehousing projects.
1. Unit Testing
2. Integration Testing
3. Technical Shakedown Testing
4. System Testing
5. Operational Readiness Testing
6. User Acceptance Testing

3.1 Unit Testing
The objective of unit testing is to test business transformation rules, error conditions, and the mapping of fields at staging and core levels.

--Unit testing involves the following
1. Check the mapping of fields present at staging level.
2. Check for duplication of values generated using the Sequence Generator.
3. Check the correctness of surrogate keys, which uniquely identify rows in the database.
4. Check the data type constraints of the fields present at staging and core levels.
5. Check that status and error messages are populated into the target table.
6. Check that string columns are left and right trimmed.
7. Check that every mapping implements the process abort Mapplet, which is invoked if the number of records read from the source is not equal to the trailer count.
8. Check that every object, transformation, source and target has proper metadata. Check visually in the data warehouse designer tool that every transformation has a meaningful description.
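A few of the unit-level checks above (duplicate generated values, trimmed string columns, and the trailer count comparison) can be sketched in plain Python. The function names, column names and sample values below are assumptions made for illustration; in a real project these checks would run against the staging and core tables rather than in-memory data.

```python
from typing import Dict, Iterable, List


def duplicate_values(values: Iterable[int]) -> List[int]:
    """Return the generated key values that occur more than once."""
    seen, dupes = set(), set()
    for value in values:
        if value in seen:
            dupes.add(value)
        seen.add(value)
    return sorted(dupes)


def untrimmed_columns(row: Dict[str, object]) -> List[str]:
    """Return the names of string columns with leading or trailing spaces."""
    return [
        name
        for name, value in row.items()
        if isinstance(value, str) and value != value.strip()
    ]


def trailer_count_matches(records_read: int, trailer_count: int) -> bool:
    """True when the number of records read equals the trailer count."""
    return records_read == trailer_count


# Hypothetical sample data standing in for a staging table extract.
surrogate_keys = [101, 102, 103, 102]
staging_row = {"customer_name": " Acme ", "city": "Pune", "amount": 10}

print(duplicate_values(surrogate_keys))  # [102]
print(untrimmed_columns(staging_row))    # ['customer_name']
print(trailer_count_matches(4, 5))       # False
```

A `False` result from the trailer comparison corresponds to the situation in which the process abort Mapplet would be expected to fire.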

3.2 Integration Testing
The objective of integration testing is to ensure that workflows are executed in the correct dependency order.
--Integration testing involves the following
1. To check the execution of workflows at the following stages: Source to Staging A, Staging A to Staging B, Staging B to Core.
2. To check that target tables are populated with the correct number of records.
3. To record the performance of the schedule and analyse the performance results.
4. To verify that the dependencies among workflows (source to staging, staging to staging, and staging to core) have been properly defined.
5. To check for error log messages in the appropriate file.
6. To verify that the start job starts at the pre-defined starting time. For example, if the start time for the first job has been configured to be 10:00AM and the Control-M group has been ordered at 7AM, the first job would not start in Control-M until 10:00AM.
7. To check that jobs restart in case of failures.

3.3 Technical Shakedown Test
Due to the complexity of integrating the various source systems and tools, several teething problems with the environments are to be expected. A Technical Shakedown Test will be conducted prior to commencing System Testing, Stress & Performance Testing, User Acceptance Testing and the Operational Readiness Test to ensure the following points are proven:
--Hardware is in place and has been configured correctly (including Informatica architecture, source system connectivity and Business Objects).
--All software has been migrated to the testing environments correctly.
--All required connectivity between systems is in place.
--End-to-end transactions (both online and batch) have been executed and do not fall over.

3.4 System Testing
The objective of system testing is to ensure that the required business functions are implemented correctly. This phase includes data verification, which tests the quality of the data populated into the target tables.
System testing involves the following
1. To check that the functionality of the system meets the business specifications.
2. To compare the count of records in the source table with the number of records in the target table, followed by analysis of rejected records.
3. To check the end-to-end integration of systems and the connectivity of the infrastructure (e.g. hardware and network configurations are correct).
4. To check all transactions, database updates and data flow functions for accuracy.
5. To validate Business reports functionality.
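The source-to-target count comparison in the system testing list can be sketched with an in-memory SQLite database. The table names (`src_orders`, `tgt_orders`, `rej_orders`) and the assumption that rejected records land in a separate reject table are illustrative only; the source does not say how rejects are stored.

```python
import sqlite3


def row_count(conn: sqlite3.Connection, table: str) -> int:
    """Count rows in a table. The name is interpolated into the SQL,
    so pass only trusted, hard-coded table names."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def unexplained_rows(
    conn: sqlite3.Connection, source: str, target: str, rejects: str
) -> int:
    """Source rows that are neither loaded into the target nor rejected."""
    return (
        row_count(conn, source)
        - row_count(conn, target)
        - row_count(conn, rejects)
    )


# Hypothetical tables standing in for a source, a target and a reject table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE src_orders (id INTEGER);
    CREATE TABLE tgt_orders (id INTEGER);
    CREATE TABLE rej_orders (id INTEGER);
    INSERT INTO src_orders VALUES (1), (2), (3), (4);
    INSERT INTO tgt_orders VALUES (1), (2), (3);
    INSERT INTO rej_orders VALUES (4);
    """
)

gap = unexplained_rows(conn, "src_orders", "tgt_orders", "rej_orders")
print(gap)  # 0
```

A non-zero gap would point to records that were dropped (or duplicated) without being logged as rejects, which is where the analysis of rejected records would start.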

--Reporting functionality: Ability to report data as required by the business using Business Objects.
--Report Structure: Since the universe and reports have been migrated from a previous version of Business Objects, it is necessary to ensure that the upgraded reports replicate the structure/format and data requirements (unless a change or enhancement has been documented in the Requirement Traceability Matrix / Functional Design Document).
--Enhancements: Enhancements such as report structure and prompt ordering which were in scope of the upgrade project will be tested.
--Data Accuracy: The data displayed in the reports/prompts matches the actual data in the data mart.
--Performance: Ability of the system to perform certain functions within a prescribed time. That the system meets the stated performance criteria according to agreed SLAs or specific non-functional requirements.
--Security: That the required level of security access is controlled and works properly, including domain security, profile security, data security, UserID and password control, and access procedures. That the security system cannot be bypassed.
--Usability: That the system is usable as per specified requirements.
--User Accessibility: That the specified type of access to data is provided to users.
--Connection Parameters: Test the connection.
--Data provider: Check for the right universe and for duplicate data.
--Conditions/Selection criteria: Test the selection criteria for correct logic.
--Object testing: Test the object definitions.
--Context testing: Ensure the formula has the correct input or output context.
--Variable testing: Test the variable for its syntax and data type compatibility.
--Formulas or calculations: Test the formula for its syntax and validate the data given by the formula.
--Filters: Test that the data has been filtered correctly.
--Alerts: Check report alerts for extreme limits.
--Sorting: Test the sorting order of section headers, fields and blocks.
--Totals and subtotals: Validate the data results.
--Universe Structure: Integrity of the universe is maintained and there are no divergences in terms of joins / objects / prompts.

3.5 User Acceptance Testing
The objective of this testing is to ensure that the system meets the expectations of the business users. It aims to prove that the entire system operates effectively in a production environment and that the system successfully supports the business processes from a user's perspective. Essentially, these tests will run through "a day in the life of" business users. The tests will also include functions that involve source systems connectivity, jobs scheduling and Business reports functionality.

3.6 Operational Readiness Testing (ORT)
This is the final phase of testing, which focuses on verifying the deployment of software and the operational readiness of the application. The main areas of testing in this phase include:

Deployment Test
1. Tests the deployment of the solution.
2. Tests the overall technical deployment "checklist" and timeframes.
3. Tests the security aspects of the system, including user authentication and authorization, and user-access levels.

Operational and Business Acceptance Testing
1. Tests the operability of the system, including job control and scheduling.
2. Tests include normal, abnormal, and fatal scenarios.

4. Test Data
Given the complexity of data warehouse projects, preparation of test data is a daunting task. The volume of data required for each level of testing is given below.
Unit Testing - This phase of testing will be performed with a small subset (20%) of production data for each source system.
Integration Testing - This phase of testing will be performed with a small subset of production data for each source system.
System Testing - In this phase a subset of live data will be used which is sufficient in volume to contain all required test conditions, including normal, abnormal, and fatal scenarios, but small enough that workflow execution time does not impact the test schedule unduly.
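One possible way to draw the 20% subset described for unit testing is a seeded random sample, so that the same rows are selected on every run. This is a sketch under assumptions: the `sample_subset` helper, the seed value and the use of simple random sampling are illustrative choices, not a method stated in the source.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_subset(rows: Sequence[T], fraction: float = 0.20, seed: int = 42) -> List[T]:
    """Return a reproducible random subset holding roughly `fraction` of rows."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    if not rows:
        return []
    size = max(1, round(len(rows) * fraction))
    # A dedicated Random instance keeps the sample stable across runs.
    return random.Random(seed).sample(list(rows), size)


# Hypothetical production extract: 100 record identifiers.
production_rows = list(range(100))
subset = sample_subset(production_rows)
print(len(subset))  # 20
```

A purely random sample does not guarantee that abnormal or fatal scenarios are present, so for system testing such a subset would likely need to be topped up with hand-picked records that exercise those conditions.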

5. Conclusion
Data warehouse solutions are becoming almost ubiquitous as a supporting technology for the operational and strategic functions at most companies. Data warehouses play an integral role in business functions as diverse as enterprise process management and monitoring, and production of financial statements. The approach described here combines an understanding of the business rules applied to the data with the ability to develop and use testing procedures that check the accuracy of entire data sets. This level of testing rigor requires additional effort and more skilled resources. However, by employing this methodology, the team can be more confident, from day one of the implementation of the DW, in the quality of the data. This will build the confidence of the end-user community, and it will ultimately lead to a more effective implementation.