First, let’s understand what data migration is and why we need it.
Why?
A data migration requirement can arise for several reasons. Below I have mentioned two of them.
There is a table in the source database (DB2) which contains customer-related data that the customer microservice needs to consume when performing its dedicated functions.
Table description in the source database (DB2)
Developers are going to create this table in the new micro database by referring to this document. At this point, as testers, we must review this document and make sure the proposed architecture (database model) is correct.
Number of columns – Verify that the proposed database model includes all the columns of the source table (customer) in DB2 which we have planned to migrate to the micro database.
Column data types – If you are NOT applying any transformation logic, the data types should match between the source and the destination.
Column sizes – The column size mentioned in the proposed architecture should be greater than or equal to the DB2 column size.
I have given an input for this point. Since we have defined "name" as varchar(30), it's better to increase the size of the "email" column from varchar(30) to varchar(40), because there is a considerable likelihood that an email address contains both the first name and the last name. Since an email address also includes a domain name (e.g. @gmail.com), the size of the "email" field should be greater than that of the "name" field.
Result – Pass
Primary Key / Foreign Key – The primary key and foreign keys (if available) should be mentioned in the proposed architecture document.
Result - Pass
Not Null – Columns defined as NOT NULL in the source (DB2) should be mentioned in the proposed architecture document.
Result – Pass
Consistency – Columns referring to the same data should be identical everywhere in the database (e.g. the data type and size of the customer id in the customer table should be the same as the data type and size of the customer id (foreign key) in the order table; see the sketch after this checklist).
Result – N/A
Default value – Discussed and agreed default values should be mentioned in the
proposed database model document.
Result – Pass
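To illustrate the consistency check above: if an order table holds the customer id as a foreign key, that column must mirror the customer table's id exactly. A minimal sketch, assuming the order table is named customer_order (the table and column names here are illustrative):

-- customer_id must use the same type and size as the id column
-- of the customer table it references.
CREATE TABLE customer_order (
    order_id VARCHAR(10) PRIMARY KEY,
    customer_id VARCHAR(10) REFERENCES customer (id)
);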
Since I have spent some quality time reviewing the proposed database model document, I identified some mistakes early which could have caused issues in the future.
-- Creates the customer table in the micro database (PostgreSQL),
-- per the proposed database model.
CREATE TABLE customer (
    id VARCHAR(10) PRIMARY KEY,
    name VARCHAR(30),
    registered_date_time BIGINT,
    email VARCHAR(40),
    is_deleted BOOLEAN DEFAULT FALSE
);
The column names, data types, sizes and data validations applied in this query can be verified against the document, and the pull request can then be approved from the tester's end. In this case you can approve the pull request containing the above query, since it will create the exact table described in the proposed database model.
Okay, step two is also done. Once everything is fine with the DB scripts, they can be executed in the QA-dedicated PostgreSQL DB instance.
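Once the scripts have been executed, the created table can be cross-checked against the proposed database model document directly from the QA instance. A quick sketch using PostgreSQL's standard information_schema:

-- Lists every column of the customer table with its type, size and
-- nullability, which can be compared one-to-one with the checklist above.
SELECT column_name, data_type, character_maximum_length, is_nullable
FROM information_schema.columns
WHERE table_name = 'customer'
ORDER BY ordinal_position;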
Extract – Extract data from the source database (in our case, DB2).
Transformation – Apply transformation logic to change the source data if needed (in our case, we have to write logic to concatenate "registered_date" and "registered_time" into "registered_date_time"; see the sketch after this list).
Load – Load the data into the destination database (in our case, PostgreSQL).
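To make the transformation step concrete: assuming the DB2 source keeps separate "registered_date" (DATE) and "registered_time" (TIME) columns and the destination stores the combined value as an epoch number in the BIGINT "registered_date_time" column, the ETL mapping would implement the equivalent of the following expression (a sketch in PostgreSQL syntax; staged_customer is an assumed staging table):

-- Combine date and time into a single timestamp, then convert it
-- to epoch seconds to fit the BIGINT destination column.
SELECT
    id,
    name,
    EXTRACT(EPOCH FROM (registered_date + registered_time))::BIGINT AS registered_date_time,
    email
FROM staged_customer;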
We can create ETL jobs using tools such as Talend and migrate data from source to destination by running those jobs.
As testers, we should review the transformation logic in the ETL job rather than waiting to verify the migrated data, so that we can prevent data mismatches or incorrect data in the destination database.
STEP FOUR (final step) – Run the ETL job, migrate data from source to destination and verify data accuracy to make sure the data migration is successful
Once the ETL job is completed, we need to make sure all the data in the source database (DB2) is in the destination database (PostgreSQL) as well. We call this data accuracy testing; it is carried out to make sure the data migration is successful.
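For example, one of the simplest accuracy checks is comparing row counts on both sides (a hedged sketch; the DB2 schema name is illustrative):

-- Destination (run against PostgreSQL)
SELECT COUNT(*) FROM customer;

-- Source (run against DB2)
-- SELECT COUNT(*) FROM MYSCHEMA.CUSTOMER;

Matching counts alone do not prove correctness, so column values (especially the transformed "registered_date_time") should be verified as well.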
For that, I have a checklist to verify destination data against source data.