Data Migration Strategy for AFP Reengineering Project
Version 1.0

TCS Confidential

ABOUT THIS DOCUMENT

Purpose
The purpose of this document is to lay out the structure for data migration for an application reengineering project.

Intended Audience
This document is primarily for the use of consultants associated with data migration projects.

Glossary
TCS: Tata Consultancy Services


Contents

1 INTRODUCTION
   Background
   Scope
   Assumptions
   Open Items
   System Description
   1.1.1 Source System Description
   1.1.2 Target System Description
2 Migration Approach
   Introduction
   Planning
   Analysis
   2.1.1 Analysis of Source Inventory
   2.1.2 Source Data Analysis
   2.1.3 Data Cleansing
   2.1.4 Extraction programs
   2.1.5 Analysis of Target Database
   Strategy definition
   2.1.6 Proof of concept
   Design
   2.1.7 Mapping rules
   2.1.8 Data Format – Source to Text File
   2.1.9 Non-key source fields becoming key fields in target
   2.1.10 Date and time stamp / load date fields and user id
   Construction
   2.1.11 Data migration approach
   2.1.12 Source System (VSAM / DB2) to Staging database (Oracle)
   2.1.13 Staging database (Oracle) to Target database (Oracle)
   2.1.14 Cleansing
   2.1.15 Audit trail data, summary data
   2.1.16 Reports
   2.1.17 Special Requirements
   Testing
   2.1.18 Validation
   2.1.19 Audit
   2.1.20 Testing Lifecycle
   Pre-Implementation (Dry Runs)
   Implementation
   2.1.21 Cutover Considerations
   2.1.22 Change Control
   2.1.23 Traceability
   2.1.24 Backup and Recovery
3 Risks
4 Guidelines
5 Recommendation
6 Responsibility Matrix


1 INTRODUCTION

Background
ING has initiated a program to replace the existing Pension Fund Management applications running on mainframe systems with a J2EE application. This project will replace these legacy systems with more flexible systems on up-to-date technological platforms with up-to-date functionality. As part of the replacement, the data from the existing mainframe applications should be moved to the target Oracle database. ING has invited Tata Consultancy Services (TCS) Limited to prepare the data migration strategy document. This document details the various steps necessary for the life cycle of the data migration project that will feed the legacy data into the state-of-the-art Oracle database.

Scope
The scope of this document is to define the strategy for the various phases of data migration. The phases in this data migration project are as follows:

• Preparation Stage
  o Planning
  o Analysis
  o Design
  o Construction
  o Testing
• Implementation Stage
  o Pre-Implementation/Dry Runs
  o Implementation/Production data migration

This document also addresses:
• Tools
• Cutover Considerations
• Proof of Concepts
• Guidelines
• Special Requirements
• Change Control and Traceability
• Challenges and Risks
• Roadmap

Assumptions
• The target data model will be developed iteration-wise and so may undergo several changes. Source data analysis therefore has to be done based on the evolving target data model. Once the target data model is baselined, unmapped fields in the source will be further analyzed to confirm whether they can actually be ignored.
• The current strategy is to extract the data from the mainframe source using Informatica PowerExchange, and to use Informatica PowerCenter to transform the data and load the Oracle target database.
• Existing master data will not be updated during the migration window.
• The production cut-over window for implementation is expected to be 48 hours over a weekend. This could change based on the volume of records; the impact will be studied and the outage window size decided.
• The scope of the data migration project is to migrate only the data that will be accessed by the target application system.
• ING will provide the list of concurrent activities during the outage window.
• ING will define the strategy, analysis, design and construction of the scripts for data cleansing. TCS will support and complement this.
• The source inventory and corresponding data are based on the assumption that the go-live date will be on a weekend that doesn't fall on a month-end.
• The data to be migrated is frozen before the start of the migration.
• There will not be any explicit lock on the data to be migrated by any of the applications accessing the data during the outage window.
• The current existing data model, including the relationships between tables that define the order of migration, is baselined and assumed to be 100% complete.

Open Items
• The need for migrating the historic and backup data on tapes which are not going to be accessed by the target application. The possible solution could be a one-time migration, either through the regular interface or using scripts, followed by incremental migration using the regular interface. The feasibility of the target application system accessing the same tapes needs to be studied.
• The migration strategy for backup data whose layout is different is yet to be finalized.
• The scope of migrating the data present on tapes which are rarely used by the application needs to be finalized. Both ING and TCS will discuss and resolve the extra effort involved and the impact on the plan.
• The risk analysis, rollback strategy, implementation details and handling of exceptions are yet to be finalized.
• The target tables and the strategy for the same will be analyzed by ING, then discussed and finalized.

System Description
The scope of the data migration project is to migrate the data from the existing mainframe system to the Oracle database. The system architecture related to these systems is:

1.1.1 Source System Description

Sl | System | Operating System | Software Platform | Database
1 | IBM Mainframe | OS/390 | COBOL, CICS | VSAM, DB2

1.1.2 Target System Description

Sl | System | Operating System | Software Platform | Database
1 | | UNIX | Java/J2EE | Oracle


2 Migration Approach

Introduction
Data migration is the process by which data is moved from source databases to target databases. Currently the source data is in VSAM files, flat files and DB2 tables on the mainframe. This data needs to be moved to the target databases in Oracle. The various phases involved in this endeavor are as described below:

• Preparation Stage
  o Planning
  o Analysis
  o Strategy Definition
  o Design
  o Construction
  o Testing
• Implementation Stage
  o Pre-Implementation/Dry Runs
  o Implementation/Production data migration

The preparation stage will be used to develop the data migration strategy and the data migration programs, which will be tested in a non-production environment. This stage will be done in seven iterations and will be synchronized with the iterations of the ING Core AFP Project. This stage is vital to the success of any data migration program. All the factors that influence the Implementation stage, such as business requirements, data volumes and infrastructure constraints, should be taken into account in the preparation stage.

The actual execution of the data migration programs on the production data will be done in the Implementation stage. Implementation is planned in two phases. Preceding each implementation will be a Pre-Implementation, or dry run, to test the data migration scripts with production data in a simulated test environment.

Planning
All planning activities required for data migration will be done in this phase. Other activities that will be taken up in this phase are the finalization of the source inventory, the creation of standards, and the definition of the strategies for data analysis, cleansing, implementation and the selection of tools.

Assumptions
• Project Plan is available

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Planning | Conduct kick-off meeting for the phase |
2 | Planning | Prepare detailed plan for the strategy documentation |
3 | Planning | Prepare detailed plan for the iterations | 14
4 | Documentation | Consolidate the source inventory; creation of standards | 15
5 | Tools | Identify and evaluate tools for data migration; finalize the list of tools and the environment setup definitions | 16
6 | Environment | Identify the development/testing environment; set up the environment for the next phase | 18
7 | Planning | Identify candidates for the Proof of Concept (POC); document the results of the proof of concept for the identified candidates | 26
8 | Configuration | Identify, document and obtain approval for the configuration and reference data requirements |
9 | Data Acceptance | Define the acceptance criteria |

Deliverables
• Updated Project Plan
• Source inventory list
• Inventory list for POC

Tools
The tools required for the various phases of data migration were identified during the POC, and the list is given below.

Sl | Process | Sub-process | Tools
1 | Extraction | VSAM | Informatica PowerExchange
1 | Extraction | DB2 | Informatica PowerExchange
2 | File Comparison | Pre-Extraction | DFSORT, COBOL
3 | Transformation | Transformation | Informatica PowerCenter (Source Analyzer and Warehouse Designer)
4 | Loading | Target Database | Informatica PowerCenter
5 | Cleansing | | Manual/SQL/Excel <<ING>>
6 | Data Analysis | | <<ING>>
7 | Audit | | Informatica <<ING/TCS>>
8 | Validation | | Informatica <<ING>>
9 | Reporting | | Informatica Reports
10 | Scheduling | | Informatica PowerCenter Workflow Manager

Analysis
A detailed analysis of the source and target databases will be carried out in this phase. Data analysis will be carried out to understand the contents of the source data, and the findings will be documented. Data cleansing requirements are documented, and the criteria for extraction, audit and validation of the source data are agreed upon.

2.1.1 Analysis of Source Inventory
The VSAM files, DB2 tables and flat files (structures, data and copybook layouts) are assumed to be baselined for inventory purposes, and their inventory needs to be documented. When data is migrated from VSAM and DB2 to Oracle, the data that needs to be migrated, and the data that is left behind in the source because of duplication and the like, need to be identified as part of scope analysis. Archive data migration will take place if the archives are in the current source format.

Sl | Description | Quantity | Link for the list
1 | No of VSAM files in inventory | 667 | List of VSAM files
2 | No of DB2 tables in inventory | 313 | List of tables
3 | No of VSAM files to be migrated | |
4 | No of DB2 tables to be migrated | |
5 | No of VSAM backups | |
6 | No of DB2 backups | |
7 | Volume of data | |
8 | Size of DB2 database | 25 GB |
9 | Size of VSAM database | 245 GB |
10 | No of DB2 tables with reference data | |
11 | No of VSAM files with reference data | |
12 | No of DB2 tables with transaction data | |
13 | No of VSAM files with transaction data | |
14 | No of DB2 tables with master data | |
15 | No of VSAM files with master data | |
16 | No of databases in the system | |

2.1.2 Source Data Analysis
Data analysis for all the source entities needs to be documented. This will be done iteration-wise, based on the evolving target data model. ING will provide the field descriptions, ranges, dates and domain values for all the fields. This will help in deciding whether an unmapped source field can be ignored or not. The following Excel format has been agreed upon, and ING and TCS will jointly complete it for all the VSAM file and DB2 table attributes and their descriptions: Field Analysis Template.xls

The analysis should also cover the following aspects of the source and target data models:
• Business dependencies between the entities
• Understanding of multiple record layouts
• Technical dependencies between the entities
• Database-specific constraints that may have a potential impact on the data conversion (for example, the impact of migrating COMP-3, REDEFINES, OCCURS, etc. from a mainframe environment to Unix/Oracle)

2.1.3 Data Cleansing
Based on the data analysis, the fields that need to be cleansed should be identified as part of the analysis phase. Data cleansing is required to ensure that only accurate, consistent and complete data is loaded into the target database. As a standardization measure, the domain values of the source database may have to be standardized for the target (based on international standards, ING specifics, or the new application design). Such domain values should be agreed upon and signed off well in advance.

Data cleansing will be required for:
• Junk characters / characters not supported by Oracle, such as nulls
• Invalid domain values
• Domain value standardization
• Values not within the range of the field
• Format consolidation (e.g. dates, amount fields)
• Referential integrity (e.g. an affiliate RUT in any transaction table should also be present in the affiliate master)
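Once a sample extract has landed in an Oracle staging table, simple SQL profiling queries can flag cleansing candidates of the kinds listed above before the formal rules are written. The sketch below is illustrative only: the table names (stg_affiliate, stg_transaction), the column names and the domain list are assumptions, not items from the actual inventory.

-- Values outside an agreed domain for a status field (domain list is illustrative).
SELECT status_code, COUNT(*) AS occurrences
FROM   stg_affiliate
WHERE  status_code NOT IN ('A', 'I', 'D')
GROUP  BY status_code;

-- Rows carrying junk characters Oracle will not accept cleanly (e.g. embedded nulls).
SELECT COUNT(*) AS junk_rows
FROM   stg_affiliate
WHERE  INSTR(affiliate_rut, CHR(0)) > 0;

-- Referential-integrity candidates: transaction RUTs missing from the affiliate master.
SELECT DISTINCT t.affiliate_rut
FROM   stg_transaction t
WHERE  NOT EXISTS (SELECT 1
                   FROM   stg_affiliate a
                   WHERE  a.affiliate_rut = t.affiliate_rut);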

The cleansing requirements should be documented clearly, stating the present condition and the proposed corrective action. The field analysis template itself can be used for documenting the cleansing requirements. Data cleansing requirements and routines will be provided by ING. We also need to identify at what stage the cleansing rules can be applied (extraction, transformation or load).

2.1.4 Extraction programs
The extraction rules will be based on the business need and the data required for each iteration. The extraction rules to extract data from the source (VSAM / DB2) need to be defined jointly by ING and TCS, and the same will be incorporated in the extraction programs.

2.1.5 Analysis of Target Database
Once the target database design is completed and baselined, the following table will be updated:

Sl | Table Name | Total | Not Null | Date | Unique Key
1 | | | | |
Total | | | | |

Assumptions
• Updated Project Plan is available
• Finalized source inventory list for the current iteration is available
• Target data model for the current iteration is available

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Analysis | Document the baselined source inventory |
2 | Analysis | Categorize the source entities into "Reference, Transaction and Master" |
3 | Analysis | Identify candidate fields; analyze and understand the domains and the range/set of valid values of the identified candidate fields |
4 | Analysis | Analyze the source and target data models for cardinality, optionality and relationships |
5 | Analysis | Understand the record identifiers for data stores with multiple layouts (internal to COBOL programs; they may be hidden in the data definition) |
6 | Analysis | Understand the impact of environment-specific constructs such as compressed data items (COMP variables in COBOL), repeating data groups (OCCURS clause in COBOL), reuse of storage space (REDEFINES and VALUE clauses in COBOL) and date structures (dates may not have the century part, or may be Julian dates) |
7 | Analysis | Identify system dependencies (e.g. the date format is date plus time in target Oracle while that may not be the case in the source; the character set on the mainframe is EBCDIC while it is ASCII on UNIX) |
8 | Analysis | Classify the entities that "must be converted for the target", the entities that "must be only used for transformation", the entities that are "not required for the target", the entities that are "redundant", and the entities that are "in question" |
9 | Analysis | Identify the owner for the entities that are "in question" |
10 | Analysis | Finalize and document the criteria for data extraction |
11 | Analysis | Identify the right source (the right instance of the data) based on discussion with the maintenance and business teams |
12 | Analysis | Define the general flow for the migration process (VSAM extract flat files versus master files); review the standards for data mapping from target to source |
13 | Data Cleansing | Identify and document the data cleansing requirements |

Deliverables
• Data analysis findings
• Updated inventory list

Challenges
• It is essential to baseline both the source and target data models to reduce rework. However, this is not practical when the analysis is done in iterations. It is vital that any changes to the source and target baselines are communicated to the data migration team immediately. The changes should be analysed at once and the data analysis document updated.
• All environment-specific constructs should be identified, and it should be verified whether the Informatica tool will handle them. If the tool does not handle them, suitable solutions should be identified for migrating them to the target. During the POC we identified the following list:
  o The character sets on the mainframe and on Unix are different: the mainframe uses EBCDIC while Unix uses ASCII. Informatica PowerCenter is able to handle this conversion.
  o OCCURS and REDEFINES can be handled by Informatica PowerCenter. For OCCURS DEPENDING we have to manually alter the data to the maximum number of occurrences before loading in Informatica PowerCenter. Usage of PowerExchange will be able to address this problem.
  o Loading DB2 null data into Oracle was found to be a problem. An extra field was manually added before every column that may contain a null, to hold the null indicator. Usage of PowerExchange will be able to address this problem.
  o In Oracle a Date is defined as YYYY-MM-DD plus time, but in VSAM files it can be any combination. A transformation rule was written in PowerCenter to transform the source date to the target format.
  o We could not find any Julian dates in the POC, so a strategy for transforming them has not been identified. Further analysis is to be done to check whether the ING Core AFP system uses Julian dates or not.

Strategy definition
The various strategies related to data migration are defined in this phase, and the data migration strategy document is prepared. This document will be updated with best practices and lessons learnt after each iteration. A proof of concept has been done to validate the migration strategy for extraction, transformation and load.

2.1.6 Proof of concept
The migration of the following VSAM files and DB2 tables was the scope of the proof of concept. The extraction, transformation and load were done for this sample data in the development environment.

VSAM
1. CUENTAS.PROD.PMC321D1
2. CUENTAS.PROD.PMC321D2
3. CUENTAS.PROD.PCT200D1
4. BENEFIC.PROD.PCB150D1
5. BENEFIC.PROD.PPR100D1
6. BENEFIC.PROD.COT905D1
7. INCORPOR.PROD.EAE02M
8. INCORPOR.DESA.EAE03M

DB2
1. PER_INC_REC
2. RECLAMO
3. EMPLEADO
4. DIRECCION_POSTAL
5. DIRECCION_PERSONA

The proof of concept is completed, and the following were proven:
1. Extraction of VSAM files to flat files and FTP to text files
2. Extraction of DB2 to flat files and FTP to text files
3. Mapping and transformation between the source and staging tables using Informatica PowerCenter
4. Mapping and transformation between the staging and target tables using Informatica PowerCenter
5. Loading of the VSAM and DB2 extract flat files into the staging tables using Informatica PowerCenter
6. Moving data from the staging database to the target database by executing the mapping and transformation scripts in an Informatica PowerCenter workflow
7. Transfer of scripts and integration between offshore and onsite

Assumptions
• Project Plan is available

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Strategy definition | Define the data migration strategy |
2 | Strategy definition | Define the testing strategy |
3 | Strategy definition | Define the implementation strategy |
4 | Strategy definition | Create the data migration strategy document |
5 | POC | Do the proof of concept |
6 | Review | Review the data migration strategy document |
7 | Presentation | Presentation to selected audience |
8 | Sign-off | Obtain sign-off from the clients on the strategy documents |

Deliverables
• Data Migration Strategy Document

Design
The objective of this phase is to define a set of rules to transform data from source to target. The mapping repository is created to maintain the list of mapping rules. The mapping rules are based on the source and target data structures and the domain information provided by ING. The following template is used for the mapping repository: "Mapping repository template.xls"

2.1.7 Mapping rules

Direct mapping: Identify the target fields with a one-to-one relationship with the source, and specify the source value to be used.

Transformation rule mapping: For the remaining target fields, document the transformation rule in detail, specifying the source fields and the computation clearly.

Default value mapping: Identify the target fields that have no relation with the source, and specify the default value to be populated.

Unmapped fields in source: Unmapped fields in the source will be analyzed, and the risk of not migrating this data will be estimated. This analysis will be done only if the field is still unmapped after all iterations are completed. Functional and design people need to be involved in taking these kinds of decisions.

2.1.8 Data Format – Source to Text File

VSAM to flat file (any COBOL layout to free-format layout). All the following conversions will be done by Informatica PowerCenter itself, based on the standards:

VSAM Data Type | Flat File | Remarks
COMP-3 | Free-format signed edited text numeric field |
COMP-2 | Free-format numeric display field |
Signed Decimal | Sign-edited text field |
COMP | Free-format signed edited text numeric field |
Numeric | Numeric |

DB2 to flat file:

DB2 Data Type | Flat File | Remarks
SMALLINT | PIC -9(4) |
INTEGER | PIC -9(9) |
DECIMAL (p,s) or NUMBER (p,s) | PIC -9(p).9(p-s) | p: precision, s: scale; 1 <= p <= 31 and 0 <= s <= p
CHAR (n) | PIC X(n) | 1 <= n <= 255

2.1.9 Non-key source fields becoming key fields in target
For the source data where non-key fields become key fields in the target, proper integrity and the proper order of migration should be ensured, so that the complete information is retained without any data inconsistency or data redundancy, and so that there is no undefined information in the system. Unique and non-unique constraints will be analyzed, and the proper validation technique will be ascertained. Proper indexes will be defined in the target system so that the access time is within the SLA.

2.1.10 Date and time stamp / load date fields and user id
Dates will be in the Oracle format of mm/dd/ccyy, with the default value set by the business. Timestamps will also default to the Oracle timestamp. For the load date field and the update user id field, the date when the loading/migration is done and a default user id will be assigned.

Assumptions
• Baselined source and target data models for the current iteration are available
• Data analysis findings are available

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Design | Create the mapping repository |
2 | Review | Review the mapping repository |

Deliverables
• Mapping repository

Construction
The objective of this phase is the development of the data migration suite. This phase consists of the creation of the extraction, transformation and load scripts for data migration.
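As a concrete illustration of the kind of transformation script this phase will produce, the date and default handling of section 2.1.10 could look like the SQL sketch below. The table and column names, the assumed YYYYMMDD raw format and the 01/01/1900 default are illustrative assumptions only; the real default value is set by the business, and the real rule lives in the Informatica mapping.

-- Hypothetical date/user-id handling on the staging-to-target path.
SELECT CASE
         WHEN TRIM(last_txn_date) IS NULL
           THEN TO_DATE('01/01/1900', 'MM/DD/YYYY')  -- business-supplied default
         ELSE TO_DATE(last_txn_date, 'YYYYMMDD')     -- assumed raw source format
       END          AS txn_date,
       SYSDATE      AS load_date,    -- date the loading/migration is done
       'DM_MIGRATE' AS update_user   -- default migration user id
FROM   stg_account;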

2.1.11 Data migration approach
The data migration will occur in two stages. In the first stage, the data will be migrated from the source systems to the staging Oracle database in the same layout as the source file layouts. In the second stage, the data will be moved from the staging database to the target Oracle database.

The following diagram depicts the data migration steps:

[Figure: data migration steps, from the source (VSAM / DB2), through JCL/COBOL extraction, to the staging and target databases]

2.1.12 Source System (VSAM / DB2) to Staging database (Oracle)

2.1.12.1 Extract
The extraction strategy given here is without Informatica PowerExchange. The impact of having Informatica PowerExchange on the extraction process will be analyzed, and the same will be updated in this document after iteration 1. The data from the VSAM files and DB2 tables is extracted by the following steps.

Steps for the extraction of VSAM files:
1. REPRO JCLs to extract the VSAM files into flat files will be written. Temporary variables are to be used in the JCL, and the name of the file is to be hard-coded in only one place.
2. The JCL should also contain a step to FTP the flat file in binary format to the FTP server.
3. The logical grouping of the files in one JCL should be determined and standardized.

Steps for the extraction of DB2 tables:
1. DB2 Unload JCLs to extract the DB2 tables' data into flat files will be written. Temporary variables are to be used in the JCL, and the names of the table and the load file are to be hard-coded in only one place.
2. The JCL should also contain a step to FTP the flat file in binary format to the FTP server.
3. The logical grouping of the tables in one JCL should be determined and standardized.
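Because the staging tables mirror the source file layouts one-to-one, their definitions stay deliberately loose: everything lands as text, with no referential integrity, plus a couple of audit columns. In practice the tables are generated in Informatica Warehouse Designer from the copybook-derived source definitions; the DDL below is only a hypothetical sketch of the shape they take.

-- Hypothetical staging table mirroring a VSAM copybook layout one-to-one.
-- Field names and sizes are illustrative, not taken from the actual inventory.
CREATE TABLE stg_account (
  rec_type      CHAR(2),        -- record identifier for multi-layout files
  affiliate_rut VARCHAR2(12),   -- kept as text, exactly as unloaded
  account_bal   VARCHAR2(18),   -- COMP-3 amounts arrive as edited text
  last_txn_date VARCHAR2(8),    -- raw date string, converted only on the staging-to-target path
  load_file     VARCHAR2(44),   -- audit: originating mainframe dataset name
  load_ts       DATE DEFAULT SYSDATE  -- audit: when the row was staged
);
-- No primary or foreign keys by design; integrity is enforced only in the target.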

The following are also done as part of the extraction process:
• Some degree of data cleansing activity will be performed as a part of the extraction process. This will include replacing junk characters with blanks and substituting zero for numeric fields.
• A reporting mechanism for each extraction process will also be developed. This will report the details of the rejected records, excluded records, bad records and bad data.
• Transferring the text files from the mainframe to the UNIX environment will be performed by typical FTP. The file to be transferred will be split into a number of files, and the split files will be compressed with the PKZIP software. The compressed files will be transferred through the UNIX box to the Informatica server. In UNIX the files will be decompressed with the help of the PKUNZIP software and will be loaded onto the Informatica server.

Pre-processing (Informatica PowerCenter):
1. The source descriptions will be defined in the Informatica Source Analyzer.
2. The COBOL format programs with copybook names with the ".CBL" extension will be written.
3. The copybooks in the same folder with the ".CPY" extension will be copied.
4. Using the source descriptions, the target table descriptions (staging Oracle database) will be defined in the Informatica Warehouse Designer.
5. The staging target tables are created in the database.

Note: Any compatibility issues between the mainframe data and loading the data into Informatica PowerCenter will be analyzed, and the extraction process may be impacted. The document will be updated accordingly.

2.1.12.2 Transform
The following steps need to be followed in Informatica PowerCenter:
1. The mapping rules are defined and linked between the source and target in the Mapping Designer.
2. The transformation rules are designed and scripted in the Transformation Developer.

2.1.12.3 Load
The following steps are involved in loading the data from VSAM and DB2 into the staging Oracle database:
1. The reusable sessions, which define the mapping, are created.
2. The workflow is created in the Workflow Manager; it defines which sessions need to be executed and the sequence and time of execution.
3. The number of workflows will be decided based on the sequence of the migration.
4. The workflow is executed to load the data from the source into the staging database.

The referential integrity will not be maintained in this database. Indexes will be created based on the performance requirements.

2.1.13 Staging database (Oracle) to Target database (Oracle)

2.1.13.1 Extract

The data from the staging database is not extracted as such; the movement is physically represented as mapping and transformation, and Informatica PowerCenter picks the data up from the staging database and carries it to the target database. The data will be ported to the Informatica server through the UNIX box.

Pre-processing (Informatica PowerCenter):
1. The target tables will already be available in the database, created by the application team.
2. The source descriptions (the staging database needs to be defined as the source) will be defined in the Informatica Source Analyzer.
3. Using the logical target database design, the target table descriptions (target Oracle database) will be defined in the Informatica Warehouse Designer.

2.1.13.2 Transform
The following steps need to be followed in Informatica PowerCenter:
1. The mapping rules are defined and linked between the source and target in the Mapping Designer.
2. The transformation rules are designed and scripted in the Transformation Developer. Cleansing activities will also be done here.

2.1.13.3 Load
The following steps are involved in loading the data from the staging Oracle database into the target Oracle database:
1. The reusable sessions, which define the mapping, are created.
2. The workflow is created in the Workflow Manager; it defines which sessions need to be executed and the sequence and time of execution.
3. The workflow is executed to load the data from the staging database into the target database.
4. The referential integrity will be maintained in this database, and hence the data loading will have to be performed based on the defined loading sequence.
5. Indexes will be created based on the target database schema requirements. Additional indexes may also be necessary to meet the performance requirements.

Note: Input source data that does not require cleansing in staging will be migrated directly to the target. The analysis of the cleansing needs of the files plays a major role in deciding this strategy. This approach will save a lot of time during the implementation.
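Because the target enforces referential integrity, parent tables must be loaded before their children. One way to derive, or at least verify, the defined loading sequence is to read the foreign-key relationships straight out of Oracle's data dictionary; the query below is a minimal sketch, assuming the target tables live in the connected schema.

-- Lists each child table alongside the parent its foreign key references.
-- Tables that never appear as children can be loaded first.
SELECT c.table_name AS child_table,
       p.table_name AS parent_table
FROM   user_constraints c
JOIN   user_constraints p
       ON c.r_constraint_name = p.constraint_name
WHERE  c.constraint_type = 'R'
ORDER  BY p.table_name, c.table_name;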

2.1.14 Cleansing

2.1.14.1 Pre-Migration (Production Phase)
This process will cleanse all the non-voluminous and business non-critical data, directly in production. The main purpose of cleaning the data directly in production is to avoid any cleaning activity on similar data in the subsequent migrations. Therefore the data which is cleaned will remain clean throughout the different phases of migration. The types of data that will be cleaned are:
• Name: entity property data, such as customer name, customer address, dealer name, bank name and DSSO name.
• Comment: entity attribute data, such as comments, descriptions, attention fields and any other fields that are not participants in business validation.

• Appropriation: any standardization data, such as customer name standardization and address standardization.

2.1.14.2 Extraction Process
This level of the cleansing process is to clean voluminous, business non-critical data. This also includes the data whose cleaning is routine and static. The data cleaned in this process are:
• Technical data (does not need any business intervention)
• Default data (handling of space, null, date)
• Cleaning of junk characters
• User-identified incorrect data

2.1.14.3 During Transformation
The major part of the data cleansing rules is applied at this stage. Cleansing at transformation includes cleansing while transforming the data from source to staging and also while transforming from staging to target. These include:
• Inconsistency in business rules
• Domain values (ZIP code, RUT)
• Unmapped data

2.1.14.4 In Staging
Some level of cleansing will be done on the data present in the staging tables. Either Java programs or SQL will be written to clean the data present in staging.

Note: If cleansing is to be done during transformation and in staging, then ING and TCS have to analyze the impact on the effort involved and the changes to the plan.

2.1.15 Audit trail data, summary data
Audit trail data will contain the total number of records migrated and the summation of any numeric field. Record rejection and record appropriation can also be included in the audit data. This will be re-validated in the target system to confirm the correctness of the file transfer. Summary data will contain all types of key information for the migration of a particular entity. For example, for the Affiliate Master, the name of the affiliate and any other information that is critical to the entity will be considered.
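Where SQL is chosen over Java for the staging-level cleansing described in 2.1.14.4, the scripts can stay very small. The statements below are a sketch only; the table, the columns and the rules themselves are illustrative assumptions, not the agreed cleansing definitions.

-- Replace junk characters (here: embedded nulls) with blanks in a character field.
UPDATE stg_account
SET    affiliate_rut = REPLACE(affiliate_rut, CHR(0), ' ')
WHERE  INSTR(affiliate_rut, CHR(0)) > 0;

-- Substitute zero for numeric text fields that arrived as spaces.
UPDATE stg_account
SET    account_bal = '0'
WHERE  TRIM(account_bal) IS NULL;

COMMIT;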

0 2. Testing will check all the transformations / mappings / workflows / cleansing / audit and validations.16 Reports Exceptions reports will be analyzed. Individual test cases need to TCS Confidential Page 24 of 36 . gap will be studied and new/changed data cleansing definition will be incorporated. Use of any reporting tool will be analyzed and finalized 2.17 Special Requirements Any special requirements that arise as part of Data Analysis will be documented and updated frequently Assumptions • • • Baselined source and target data model for the current iteration is available Mapping repository available Data cleansing requirements available Activities SL 1 2 3 4 5 6 7 Category Construction Construction Construction Construction Data Cleansing Validation Audit Task Extraction routines to be written for extracting data from mainframe Source and target definitions to be created in informatica power center using information from source and target Data model Mapping and transformation rules are created in informatica power center based on the information collected in mapping repository Session and workflows are created using power center for executing the mapping and transformation rules.1. to the verify the correctness of migration (Business Validation) Finalize and document the criteria to verify the completeness of migration (Technical Validation) Schedule (WeekDay) Deliverables • • • • • Extraction routines Source and target definitions Mapping and transformation rules Load routines (Sessions and workflows) Audit and validation routines Testing This phase comprises of testing the data migration suite for each iteration. Data cleansing rules are also written if required in this stage Finalize and document the criteria for data validation.1. Both the rule definition and programs will be configured.Data Migration Strategy for AFP Reengineering Project Version 1.

The following matrix illustrates the broad areas that the test cases will pertain to:

Attributes | Measurement plan | Remarks
Business important fields for checksum | 1. Identify all business-important fields that can be used for summation checks in the data extracts and in the target tables. 2. Perform summations on the identified fields in the incoming data files and match the sums. 3. Perform summations on the identified fields in the ODS and match with those of the incoming data. | Business-important fields that can be used for checksums need to be requested from the ING users, and they should be included in the extracts.
Business rules | All data elements are to be mapped to business rules. All data elements and relationships should pass the associated business rules (e.g. a data attribute can contain only one value out of a set of values). | All business rules should be provided by the ING users, and TCS will do a feasibility analysis for the same.
Integrity checks | 1. Identify all integrity constraints. 2. All data must pass through the associated integrity constraints (e.g. there can be no detail records in the absence of a master). | The integrity constraints should be specified by the ING users and verified and validated.
Outlier conditions | Identify the minimum, maximum and default values for the data attributes. All data attributes should contain a valid value; raise an alert when invalid values are detected. | The min, max and default values should be provided, verified and validated.
Alert mechanism | Identify all steps which need to generate an alert (e.g. invalid incoming data, failed integrity checks, outliers) and raise the alerts. | Any specific alert requirements should be specified in the ETL strategy, to incorporate the same in development.
Correctness of calculations | Identify the fields involving complex calculations; recalculate once loading is complete; match with the previously calculated values. | The ING users are to specify the critical fields involving complex calculations, and the same should be incorporated.
Audit trail | Identify the data to be captured in the audit trail (e.g. records inserted from file, file name, load date); capture the audit attributes during the load process and store them in an audit table. | Any specific audit requirements should be specified in the ETL specs and will be incorporated.
Incoming data summary | Identify the summary information for the input data, to be sent in an additional file (file name, number of records, date). Perform checks on the incoming data: match the record count in the control file against the actual number of records received, and raise an alert in case of mismatch. | The incoming control summary file specification is to be provided, and the same should be incorporated in the extract.
Business test cases | TCS will write 40 to 50 test cases to check the business scenarios for audit. Business test cases could be SQL queries that fetch the data from the target and verify it using the existing mainframe data. | The choice of the business criteria can be identified from the legacy reports or may be provided by ING. Information on the critical reports is to be provided by ING.

2.1.18 Validation
The following are the validations that will be performed to ensure the correctness of the data migrated.

No | Category | Source | Destination | Criteria
1 | Number of physical records | All entities | All entities | Exact match
2 | Sum | Field-1, Table-1 | Field-1, Table-1 | Exact match or deviation justified
3 | Sum against a branch (group by) | Field-1, Table-1 | Field-1, Table-1 | Exact match or deviation justified
4 | Total number of active affiliates | Count Field-1, Table-1 | Count Field-1, Table-1 | Exact match or deviation justified
5 | Total number of deceased affiliates | Count Field-2, Table-1 | Count Field-2, Table-1 | Exact match or deviation justified
6 | Totals | Field-1, Table-2 | Field-1, Table-2 | Exact match or deviation justified
7 | Status fields | Field-1, Table-1 | Field-1, Table-1 | Exact match or deviation justified
8 | Null fields | Count Field-1 | Count Field-1 | Exact match or deviation justified
9 | Blank fields | Count Field-2 | Count Field-2 | Exact match
10 | Not-null fields | Count Field-1 | Count Field-1 | Exact match
11 | Duplicate rows | Table-1 | Table-1 | Exact match
12 | Deleted rows | Table-2 | Table-2 | Justify
13 | Key fields (RUT, Folio Number) | Field-1, Table-1 | Field-1, Table-1 | Exact match (compare by key)
14 | Name fields | Field-1, Table-3 | Field-1, Table-3 | Verify correct
16 | Round off | Field-1, Table-1 | Field-1, Table-1 | Exact match (group by range)
17 | Truncation error on identified fields | Identify decimal places | Correct truncation | Exact match
18 | Exceptions | Defined | Defined | Validate
19 | Bad records | Defined | Defined | Validate

2.1.19 Audit
Audit rules are expected to be defined by the ING Core AFP Data Migration team in the following format. The auditing should be done based on the reliable reports from the business; the business reports to be used for auditing will be provided by ING.

No | Category | Source | Destination | Criteria
 | | | |

2.1.20 Testing Lifecycle
• The construction and unit testing will be done by the TCS onsite/offshore team after the finalization of the design document. This will be on a going-forward basis.
• The migration components will be delivered by TCS upon completion of construction/unit testing. The components will be validated by the ING Data Migration team. TCS will support the testing.
• After this primary validation, a bigger Revolution Unit Testing will be performed by the ING Data Migration team after the ING Core AFP application is delivered.
• After the completion of this phase, Performance Testing and Revolution Unit Testing will be performed in parallel. TCS will support these two types of testing.

Assumptions
• Data migration suite available (extraction, transformation and load routines)
• Audit and validation routines available
• Source data for migration is available

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Testing | Test the data migration suite |
2 | Audit and validation | Run the audit and validation scripts and verify the completeness and correctness of the data migration |
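In their simplest form, the audit and validation scripts run in the activities above reduce to count, checksum and existence queries. The sketch below is illustrative: the table and column names are hypothetical stand-ins for the real staging and target objects, not names from the inventory.

-- Record-count audit: rows staged versus rows loaded.
SELECT (SELECT COUNT(*) FROM stg_account) AS staged_rows,
       (SELECT COUNT(*) FROM tgt_account) AS loaded_rows
FROM   dual;

-- Checksum validation on a business-important numeric field, per branch.
SELECT branch_code, SUM(account_bal) AS total_balance
FROM   tgt_account
GROUP  BY branch_code
ORDER  BY branch_code;

-- Business validation: every transaction RUT must exist in the affiliate master.
SELECT t.affiliate_rut
FROM   tgt_transaction t
WHERE  NOT EXISTS (SELECT 1
                   FROM   tgt_affiliate a
                   WHERE  a.affiliate_rut = t.affiliate_rut);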

Deliverables
• Tested data migration suite

Pre-Implementation (Dry Runs)
Pre-Implementation, or dry run, is the simulation of the production implementation in the test environment. The objective is to understand the complexities during implementation, in terms of the window for data migration and the infrastructure requirements, and to fine-tune the programs and the implementation procedures if required. This will be done by the ING Data Migration team and the Business Capability Team. TCS will support this testing.

Data migration implementation is planned in two phases, so a pre-implementation run will also be done for each of these phases.

The strategy describes the go/no-go checkpoints after the different stages, and a Root Cause Analysis (RCA) will be done for each checkpoint. Based on the RCA, the data mapping, data model, migration design and migration component code will be revisited, and the necessary actions will be taken.

Assumptions
• Tested data migration suite is available for the current implementation phase
• Test environment that is simulated based on production is available
• Source data for the pre-implementation dry run is available

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Pre-implementation | Test the data migration suite |
2 | Audit and validation | Run the audit and validation scripts and verify the completeness and correctness of the data migration |
3 | Performance | Performance tuning of the data migration suite, if required |

Deliverables
• Full-volume tested data migration suite

Implementation
The Implementation phase comprises the activities for implementing the actual production data migration. The implementation of data migration depends mainly on the implementation window, the volume of the data to be migrated, and the type of data. On further analysis of the data, and after discussions, the implementation strategy will be finalized. As of now, only the implementation of the phase 1 roll-out is considered. Based on further analysis, the document will be updated for the implementation of the phase 2 roll-out.

All the backup data will be migrated two weeks ahead, the reference data will be migrated one week ahead, and the transaction and master data will be migrated on the one weekend before go-live. The same is depicted in the figure below.

[Figure: migration timeline, with milestones on 17th Sep, 22nd Oct and the 28th]

Points to be considered in adopting this approach:
1. All the backup data can be extracted in 48 hours.
2. All the reference data can be extracted in 48 hours.
3. All the transaction, master and catch-up reference data can be extracted, cleaned, transformed and loaded in 48 hours.
4. It is assumed that the data migrated on the first weekend is not going to change at all.
5. It is assumed that the data migrated on the second weekend (reference data) may not change in one week.
6. Additional effort is involved in doing the catch-up for the reference data.
7. Testing of the data will be in the parallel run time.
8. Incremental migration may be required for the reference data.

Note: The data cleansing implementation is not considered here. The cleansing implementation will have an impact on the strategy defined here, and this document will be updated based on the cleansing implementation.

The source files of the ING Core AFP System, split according to the modules, and the best strategy and time for migrating the data, will be tabulated in the following format once the approach is finalized. The numbers of records and the database sizes are based on the available information in production.

Sl | System | Data | # (M) | Volume (GB) | Vertical Split (By Design) | Horizontal Split | Special Treatment | Link for the list of Tables/Files | Proposed date of Migration
1 | Contracts | Transaction | | | | | | |
2 | Contracts | Master | | | | | | |
3 | Contracts | Reference | | | | | | |
4 | Accounts-1 | Transaction | | | | | | |
5 | Accounts-1 | Master | | | | | | |
6 | Accounts-1 | Reference | | | | | | |
7 | Claims-1 | Transaction | | | | | | |
8 | Claims-1 | Master | | | | | | |
9 | Claims-1 | Reference | | | | | | |
10 | Accounts-2 | Transaction | | | | | | |
11 | Accounts-2 | Master | | | | | | |
12 | Accounts-2 | Reference | | | | | | |
13 | Claims-2 | Transaction | | | | | | |
14 | Claims-2 | Master | | | | | | |
15 | Claims-2 | Reference | | | | | | |
16 | Pensions | Transaction | | | | | | |
17 | Pensions | Master | | | | | | |
18 | Pensions | Reference | | | | | | |
19 | Bonds | Transaction | | | | | | |
20 | Bonds | Master | | | | | | |
21 | Bonds | Reference | | | | | | |

Assumptions
• Full-volume tested data migration suite is available for the current implementation phase

Activities

SL | Category | Task | Schedule (Week-Day)
1 | Implementation | Back up the source data to be migrated, if required |
2 | Implementation | Back up the target data in the phase 2 implementation, as the target database will be operational between the phase 1 and phase 2 implementations |
3 | Implementation | Execute the data migration suite (extraction, cleansing, transformation and load scripts) |
4 | Implementation | Resolve and reconcile any data errors |
5 | Audit and validation | Execute the audit and validation scripts and verify the completeness and correctness of the data migration |
6 | Implementation | Resolve and reconcile any errors encountered |
7 | Implementation | Invoke the fallback procedures if unable to resolve and reconcile the errors encountered |
8 | Implementation | Make the target application go live |

Deliverables
• Data migrated to the target tables as per the data migration requirements

2.1.21 Cutover Considerations

Sl | Candidate | Issue
1 | Master files | A weekend cutover will not have any issue. Go-live on a weekday will require the files to be kept on hold.
2 | Quarterly backup files | These types of files, which are not going to be modified, can be migrated two weeks ahead.
3 | Contracts | All the contracts-related files should go live at the month-end only.
4 | Deceased data | Data related to the deceased can be migrated well ahead, as it is not going to be modified.
5 | Closed claims | All data pertaining to closed claims can be migrated well ahead.
6 | Inactive affiliates | The data related to inactive affiliates can be migrated well ahead, as it is not going to be modified.
7 | DB2 tables | A weekend cutover will not have any issue. Go-live on a weekday will require the records to be locked.
8 | Maintenance changes | Stop online users from doing any maintenance transactions in the last 3-4 days before implementation. This will make the database more static.
9 | Regulatory changes | Stop applying regulatory changes in the last one month before implementation.
10 | Final backups prior to migration | After the completion of the batch cycle, the final backups need to be taken.

2.1.22 Change Control
Scope of change control:

Sl | Artifacts | Owner
1 | Source Database Schema | Business Analyst Team
2 | Source Data | Business Analyst Team
3 | Target Data Model | Business Analyst Team
4 | Extraction Rules | Business Analyst Team
5 | Extraction Programs | Technical Team
6 | Extraction Jobs/Schedules | Technical Team
7 | Extracted Data on mainframe | Technical Team
8 | Transfer Programs | Technical Team
9 | Transformation Rules | Business Analyst Team
10 | Transformation Programs | Technical Team
11 | Transformation Jobs/Schedules | Technical Team
12 | Transferred Data (in UNIX) | Technical Team
13 | Conversion Database | Technical Team
14 | Transformed Data | Technical Team
15 | Loading Programs | Technical Team
16 | Loaded Data | Technical Team
17 | Loading Jobs/Schedules | Technical Team
18 | Cleansing Rules | Business Analyst Team
19 | Cleansing Programs/Scripts | Technical Team
20 | Cleansing Report | Technical Team
21 | Validation Rules | Technical Team
22 | Validation Programs | Technical Team
23 | Validation Reports | Technical Team
24 | Test Case (Unit/Integration) | Technical Team
25 | Test Script (Unit/Integration) | Technical Team
26 | Test Result (Unit/Integration) | Technical Team
27 | Test Report (Unit/Integration) | Technical Team
28 | Audit Rules | Business Analyst Team
29 | Audit Programs | Technical Team
30 | Audit Reports | Technical Team

All of the above artifacts will be kept under a formal repository.

2.1.23 Traceability
The data migration artifacts (documents and programs) are to be traced from the target fields through to the audit and validation routines in the ING Core AFP system. The following diagram depicts the traceability requirements at the different stages.

[Figure: traceability diagram linking the target tables and fields back to the source tables and fields]

The following example can be used as a template for the traceability matrix. Each target field is identified by a field number of the form <Table Number>-<Field Number>.

Sl | Trace | Tracing to
1 | Target Field |
2 | Target Table |
3 | Source Field |
4 | Source Table / File |
5 | Clean Rule |
6 | Clean Program |
7 | Clean Report |
8 | Extract Rule |
9 | Extract Program |
10 | Extract Job |
11 | Extract Schedule |
12 | Transfer Spec |
13 | Transfer Program |
14 | Transfer Jobs |
15 | Transfer Schedule |
16 | Transform Spec |
17 | Transform Program |
18 | Transform Job |
19 | Transform Schedule |
20 | Clean Rule |
21 | Clean Program |
22 | Clean Report |
23 | Load Spec |
24 | Load Program |
25 | Load Job |
26 | Load Schedule |
27 | Clean Rule |
28 | Clean Program |
29 | Clean Report |
30 | Test Case (Unit/Integration) |
31 | Test Script (Unit/Integration) |
32 | Test Result (Unit/Integration) |
33 | Test Report (Unit/Integration) |
34 | Validation Rule |
35 | Validation Script |
36 | Validation Report |
37 | Audit Rule |
38 | Audit Script |
39 | Audit Report |

2.1.24 Backup and Recovery
The version-controlled artifacts, with their backup and recovery process, will be placed in the Rational ClearCase tool under the corresponding folders. The frequency of the backups will depend on the type of the artifact.

The strategy developed for the data migration has been modularized to enable a restart of the process at any stage of failure. The exception handling during the outage window will be decided based on the business criticality of the data being migrated. The possible ways of handling exceptions are:
1. Delete everything and start from the beginning.
2. Write the exception to a separate file and continue with the migration without inserting that record.
3. Write the exception to a separate file and continue with the migration, inserting the record with predefined values.
4. Stop the migration, analyze the exception, solve it, and restart the migration from the last commit point.

3 Risks
• The target database design is not available on time. This may impact the definition of the mapping rules and the transformation rules, and the whole migration process.
• Delay in the source inventory analysis by ING.
• Delay in the data cleansing activities by ING.
• The production cut-over window for implementation is expected to be 48 hours over a weekend. This window might get reduced.

• Environment readiness.
• Cutover window: network, link, database; extended production window.
• Software version changes (Oracle, Informatica, OS).
• Major changes in the source due to SAFP regulatory changes.
• Changes in the layout of the files.
• The scripts written for extraction and all the other activities need to be written in standard formats, and backups need to be taken periodically.

4 Guidelines
• Pre-extraction data cleansing is advisable for voluminous, non-critical data.
• A vertical split of the source data is preferable only when the design of the target data model demands it. A vertical split to handle voluminous data is not recommended.
• Non-critical and backup data can be migrated two weeks before the system goes live. This portion will include only the data that is unlikely to change. The data related to the active accounts will be migrated on the production cutover weekend.

5 Recommendation
1. The assumption of a month-end-weekend implementation of the whole ING Core AFP data migration may not hold good. To minimize the risk, the implementation strategy assumes a one-off data migration on the weekend and an incremental build for five days.
2. The data migration for phase I and phase II will be assumed to be two separate implementations.
3. Several cutoff issues are related to the handling of the interfaces during the cutover window. The fine line between the several interfaces and the cutoff scenario of the data migration has to be properly monitored. We recommend a formal weekly interaction among the interface team, the migration team, the business capability team and the maintenance team. This will provide the proper insight into the project as well as help the change control mechanism.
4. The target data model for the conversion database is yet to be firmed up. Once we have the firmed-up target model, the mapping can be started. However, the iterative development model of the ING Core AFP project definitely demands an iterative construction and unit testing phase for the data migration programs.
5. Candidate field analysis for all the source data is recommended upfront, to identify the potential data-cleansing requirements.
6. We recommend a comprehensive traceability matrix based on the target fields of the target database.
7. The complete migration life cycle (extraction to loading on the target database) has been designed with redundancy and modularity. This is to ensure that at every logical break point one can commit or restart.
8. The DM POC can be used as a contingency plan for the complete implementation.

6 Responsibility Matrix
"Responsibility matrix.xls"
