Database (Data) testing: The comprehensive guide

In most cases, the GUI receives the most emphasis from test managers and development team members, since the graphical user interface happens to be the most visible part of the application. However, it is just as important to validate the information that can be considered the heart of the application: the DATABASE.
Consider a banking application in which a user makes transactions. From a database testing viewpoint, the following things are important:

1. The application stores the transaction information in the application database and displays it correctly to the user.
2. No information is lost in the process.
3. No partially performed or aborted operation information is saved by the application.
4. No unauthorized individual is allowed to access the user's information.
To ensure all of the above objectives, we need to use data validation, or data testing.
Fundamental differences between user-interface and data testing


The comparison below pairs each aspect of user-interface testing with its database (data) testing counterpart.

1. Alias: User-interface testing is also known as graphical user interface (GUI) testing or front-end testing. Database testing is also known as back-end testing or data testing.
2. Scope: User-interface testing chiefly deals with all the testable items that are open to the user for viewing and interaction, such as forms, presentation, graphs, menus and reports (created through front-end tools like VB, VB.NET, VC++ and Delphi). Database testing chiefly deals with all the testable items that are generally hidden from the user: the internal processes and storage, such as assemblies and DBMSs like Oracle, SQL Server, MySQL etc.
3. Typical checks: User-interface testing includes validating text boxes, select dropdowns, calendars and buttons, navigation from one page to another, display of images, and the look and feel of the overall application. Database testing involves validating the schema, database tables, columns, keys and indexes, stored procedures, triggers, database server configuration and data duplication.
4. Required skills: The user-interface tester must be thoroughly knowledgeable about the business requirements as well as the usage of the development tools and of automation frameworks and tools. To perform back-end testing, the tester must have a strong background in database servers and Structured Query Language (SQL) concepts.
Types of database testing

The 3 types of Database Testing are
1. Structural Testing
2. Functional Testing
3. Non-functional Testing
Let's look into each type and its sub-types one by one.
Structural database testing
Structural database testing involves validating all the elements inside the data repository that are used primarily for the storage of data and that end users are not allowed to manipulate directly. Validation of the database servers is also an important consideration in this type of testing. Successful completion of this phase requires the testers to have a good command of SQL queries.
Schema testing
The chief aspect of schema testing is to ensure that the schema mappings between the front end and the back end match. Hence we may also refer to schema testing as mapping testing.
Let us discuss the most important checkpoints for schema testing.
1. Validation of the various schema formats associated with the databases. Many times the mapping format of a table may not be compatible with the mapping format present at the user interface level of the application.
2. Verification of any unmapped tables/views/columns.
3. Verification that heterogeneous databases in an environment are consistent with the overall application mapping.
Let us also look at some of the interesting tools for validating database schemas.
- DBUnit, which integrates with Ant, is very suitable for mapping testing.
- SQL Server allows testers to check and query the schema of the database by writing simple queries rather than code.
For example, if the developers want to change a table structure or delete it, the tester would want to ensure that all the stored procedures and views that use that table are compatible with that particular change. Another example: if the testers want to check for schema changes between two databases, they can do so using simple queries, as sketched below.
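As a minimal sketch of the second example, assuming two placeholder databases named DatabaseA and DatabaseB on SQL Server, the INFORMATION_SCHEMA views of both can be compared directly:

    -- Report columns that exist in only one database, or whose data types differ.
    SELECT a.TABLE_NAME, a.COLUMN_NAME,
           a.DATA_TYPE AS type_in_A, b.DATA_TYPE AS type_in_B
    FROM DatabaseA.INFORMATION_SCHEMA.COLUMNS a
    FULL OUTER JOIN DatabaseB.INFORMATION_SCHEMA.COLUMNS b
         ON  a.TABLE_NAME  = b.TABLE_NAME
         AND a.COLUMN_NAME = b.COLUMN_NAME
    WHERE a.COLUMN_NAME IS NULL       -- column present only in DatabaseB
       OR b.COLUMN_NAME IS NULL       -- column present only in DatabaseA
       OR a.DATA_TYPE <> b.DATA_TYPE; -- same column, different data type

Any row returned points to a schema difference that should be reconciled against the mapping documents.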
Database table, column testing
Let us look into the various checks for database table and column testing (a sketch query for checks 2 and 4 follows this list).
1. Whether the mapping of the database fields and columns in the back end is compatible with the mappings in the front end.
2. Validation of the length and naming conventions of the database fields and columns as specified by the requirements.
3. Validation of the presence of any unused/unmapped database tables/columns.
4. Validation that the data type and field length of each back-end database column are compatible with their counterparts in the front end of the application.
5. Whether the database fields allow the user to provide the desired inputs as required by the business requirement specification documents.
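As an illustrative sketch, length and data type checks such as those in points 2 and 4 can be run against the ANSI INFORMATION_SCHEMA views; the table name 'Customers' here is a placeholder:

    -- List column names, data types, lengths and nullability for one table so
    -- they can be compared against the mapping document and the requirements.
    SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = 'Customers'
    ORDER BY ORDINAL_POSITION;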
Keys and indexes testing
Important checks for keys and indexes (a sketch query for check 1 follows this list):
1. Check whether the required primary key and foreign key constraints have been created on the required tables.
2. Check whether the references for the foreign keys are valid.
3. Check whether the data type of the primary key and the corresponding foreign keys is the same in both tables.
4. Check whether the required naming conventions have been followed for all keys and indexes.
5. Check the size and length of the required fields and indexes.
6. Check whether the required clustered and non-clustered indexes have been created on the required tables as specified by the business requirements.
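A minimal sketch of check 1, using the ANSI INFORMATION_SCHEMA views ('Orders' is a placeholder table name):

    -- List the primary and foreign key constraints defined on a table.
    SELECT TABLE_NAME, CONSTRAINT_NAME, CONSTRAINT_TYPE
    FROM INFORMATION_SCHEMA.TABLE_CONSTRAINTS
    WHERE TABLE_NAME = 'Orders'
      AND CONSTRAINT_TYPE IN ('PRIMARY KEY', 'FOREIGN KEY');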
Stored procedures testing
The most important things to validate for stored procedures are listed below (a sketch of checks 4 and 5 follows this list).
1. Whether the development team adopted the required coding standard conventions and exception/error handling for all the stored procedures in all the modules of the application under test.
2. Whether the development team covered all the conditions/loops by applying the required input data to the application under test.
3. Whether the development team properly applied TRIM operations whenever data is fetched from the required tables in the database.
4. Whether manual execution of the stored procedure provides the end user with the required result.
5. Whether manual execution of the stored procedure ensures the table fields are updated as required by the application under test.
6. Whether execution of the stored procedures implicitly invokes the required triggers.
7. Validation of the presence of any unused stored procedures.
8. Validation of the Allow Null condition, which can be done at the database level.
9. Validation that all the stored procedures and functions execute successfully when the database under test is blank.
10. Validation of the overall integration of the stored procedure modules as per the requirements of the application under test.
Some of the interesting tools for testing stored procedures are LINQ, the SP Test tool etc.
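A minimal sketch of checks 4 and 5 on SQL Server: execute the procedure manually with known inputs, then verify the affected rows. The procedure, its parameters and the table below are hypothetical names:

    -- Execute the stored procedure manually with known test inputs.
    EXEC usp_UpdateAccountBalance @AccountId = 1001, @Amount = 250.00;

    -- Verify that the table fields were updated as the application requires.
    SELECT AccountId, Balance, LastUpdated
    FROM Accounts
    WHERE AccountId = 1001;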
Trigger testing
1. Whether the required coding conventions have been followed during the coding phase of the triggers.
2. Check whether the triggers executed for the respective DML transactions have fulfilled the required conditions.
3. Whether the trigger updates the data correctly once it has been executed.
4. Validation of the required update/insert/delete trigger functionality in the realm of the application under test. (A sketch follows this list.)
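For instance, an insert trigger can be exercised with a DML statement and its effect verified afterwards; the tables below, including the audit table assumed to be populated by the trigger, are hypothetical:

    -- Fire the trigger with a DML transaction (check 2).
    INSERT INTO Transactions (AccountId, Amount) VALUES (1001, 99.50);

    -- Verify that the trigger updated the data correctly (check 3);
    -- TransactionAudit is assumed to be written by the insert trigger.
    SELECT TOP 1 * FROM TransactionAudit ORDER BY AuditTime DESC;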
Database server validations

1. Check the database server configurations as specified by the business requirements.
2. Check the authorization of the required user to perform only those levels of actions
which are required by the application.
3. Check that the database server is able to cater to the maximum allowed number of user transactions as specified by the business requirement specifications.
Functional database testing
Functional database testing needs to ensure that the transactions and operations performed by end users are consistent with the requirement specifications.
The following are the basic conditions that need to be observed for database validations.
- Whether a field that is mandatory nevertheless allows NULL values.
- Whether the length of each field is of sufficient size.
- Whether all similar fields have the same names across tables.
- Whether there are any computed fields present in the database.
This particular process is the validation of the field mappings from the end-user viewpoint. In this scenario, the tester performs an operation at the database level and then navigates to the relevant user interface item to observe and validate whether the proper field validations have been carried out.
The vice-versa condition, whereby an operation is first carried out by the tester at the user interface and then validated from the back end, is also a valid option.
Checking data integrity and consistency
The following checks are important (a commit/rollback sketch follows this list).
1. Whether the data is logically well organized.
2. Whether the data stored in the tables is correct and as per the business requirements.
3. Whether there is any unnecessary data present in the application under test.
4. Whether data updated from the user interface has been stored as per the requirements.
5. Whether TRIM operations were performed on the data before it was inserted into the database under test.
6. Whether the transactions have been performed according to the business requirement specifications and whether the results are correct.
7. Whether the data has been properly committed when the transaction executed successfully, as per the business requirements.
8. Whether the data has been rolled back successfully when the transaction did not execute successfully.
9. Whether the data has been rolled back at all when the transaction did not execute successfully and multiple heterogeneous databases were involved in the transaction in question.
10. Whether all the transactions have been executed using the required design procedures as specified by the system business requirements.
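A minimal sketch of checks 7 and 8, assuming a hypothetical Accounts table: wrap the operation in an explicit transaction and verify the data after the COMMIT (or, on failure, after the ROLLBACK):

    BEGIN TRANSACTION;

    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountId = 1001;
    UPDATE Accounts SET Balance = Balance + 100 WHERE AccountId = 1002;

    -- On success the changes must be committed (check 7); on failure this
    -- statement would be ROLLBACK TRANSACTION instead (check 8).
    COMMIT TRANSACTION;

    -- Either way, verify the resulting balances afterwards.
    SELECT AccountId, Balance FROM Accounts WHERE AccountId IN (1001, 1002);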

Login and user security
The validations of the login and user security credentials need to take into consideration the
following things.
1. Whether the application prevents the user from proceeding further in the case of
- an invalid username and a valid password,
- a valid username and an invalid password, or
- an invalid username and an invalid password,
and allows the user to proceed only with a valid username and a valid password.
2. Whether the user is allowed to perform only those specific operations that are specified by the business requirements.
3. Whether the data is secured from unauthorized access.
4. Whether different user roles have been created with different permissions.
5. Whether all the users have the required levels of access to the specified database as required by the business specifications. (A sketch for this check follows the list.)
6. Check that sensitive data like passwords and credit card numbers are encrypted and not stored as plain text in the database. It is good practice to ensure that all accounts have complex passwords that are not easily guessed.
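As an illustrative sketch for check 5 on SQL Server, the effective database-level permissions of the current login can be listed with the built-in fn_my_permissions function and compared against the business specifications:

    -- List the effective permissions of the current user at database level.
    SELECT permission_name
    FROM fn_my_permissions(NULL, 'DATABASE');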
Non-functional testing
Non-functional testing in the context of database testing can be divided into various categories as required by the business: load testing, stress testing, security testing, usability testing, compatibility testing and so on. Load testing and stress testing, which can be grouped under the umbrella of performance testing, serve two specific purposes in non-functional testing.
Risk quantification: Quantification of risk helps the stakeholders ascertain the system response time requirements under the required levels of load. This is the original intent of any quality assurance task. Note that load testing does not mitigate risk directly; rather, through risk identification and risk quantification, it presents corrective opportunities and an impetus for remediation that will mitigate risk.
Minimum system requirement: Identifying, through formal testing, the minimum system configuration that will allow the system to meet the formally stated performance expectations of the stakeholders, so that extraneous hardware and software and the associated cost of ownership can be minimized. This can be categorized as an overall business optimization requirement.
Load testing
The purpose of any load test should be clearly understood and documented.
The following types of configurations are a must for load testing.
1. The most frequently used user transactions, since they have the potential to impact the performance of all the other transactions if they are not efficient.
2. At least one non-editing user transaction should be included in the final test suite, so that the performance of such transactions can be differentiated from that of other, more complex transactions.
3. The more important transactions that facilitate the core objectives of the system should be included, as failure under load of these transactions has, by definition, the greatest impact.
4. At least one editable transaction should be included so that the performance of such transactions can be differentiated from other transactions.
5. Observation of the optimum response time under a large number of virtual users for all the prospective requirements.
6. Observation of the effective times for fetching various records.
Important load testing tools are LoadRunner, WinRunner and JMeter.
Stress testing
Stress testing is also sometimes referred to as torturous testing, as it stresses the application under test with enormous workloads until the system fails. This helps in identifying the breakdown points of the system.
Important stress testing tools are LoadRunner, WinRunner and JMeter.
Most common occurring issues during database testing
1. Issue: A significant amount of overhead can be involved in determining the state of the database transactions.
Solution: The overall process planning and timing should be organized so that no time- and cost-based issues appear.
2. Issue: New test data has to be designed after cleaning up the old test data.
Solution: A prior plan and methodology for test data generation should be at hand.
3. Issue: SQL queries need to be generated and validated to ensure that they are apt for handling the required database test cases.
Solution: Maintenance of the SQL queries and their continuous updating is a significant part of the overall testing process and should be part of the overall test strategy.
4. Issue: The above prerequisites mean that setting up the database testing procedure can be costly as well as time consuming.
Solution: There should be a fine balance between quality and the overall project schedule.

Myths or Misconceptions related to Database Testing.
1. Database testing requires plenty of expertise and is a very tedious job.
Reality: Effective and efficient database testing provides long-term functional stability to the overall application, so it is worth putting in the hard work.
2. Database testing adds an extra work bottleneck.
Reality: On the contrary, database testing adds value to the overall work by finding hidden issues and thus proactively helping to improve the application.
3. Database testing slows down the overall development process.
Reality: A significant amount of database testing helps improve the overall quality of the database application.
4. Database testing can be excessively costly.
Reality: Any expenditure on database testing is a long-term investment that leads to long-term stability and robustness of the application. Thus expenditure on database testing is necessary.
Best Practices
- All data, including the metadata as well as the functional data, needs to be validated according to its mapping in the requirement specification documents.
- Test data created by, or in consultation with, the development team needs to be verified.
- Validate the output data using both manual and automated procedures.
- Deploy techniques such as cause-effect graphing, equivalence partitioning and boundary-value analysis to generate the required test data conditions.
- The referential integrity validation rules for the required database tables also need to be validated.
- The selection of default table values for validating database consistency is a very important concept.
- Check whether the log events have been successfully added to the database for all required login events.
- Do scheduled jobs execute in a timely manner?
- Take timely backups of the database.

ETL/Data Warehouse Testing: The Ultimate Guide
Before we learn ETL Testing, let’s understand –
What is BI?
Business Intelligence (BI) is the process of collecting raw data or business data and turning it into information that is useful and more meaningful. The raw data consists of the records of the daily transactions of an organization, such as interactions with customers, administration of finance, management of employees and so on. This data is used for reporting, analysis, data mining, data quality and interpretation, and predictive analysis.

What is Data Warehouse?
A data warehouse is a database that is designed for query and analysis rather than for transaction processing. The data warehouse is constructed by integrating data from multiple heterogeneous sources. It enables the company or organization to consolidate data from several sources and separates the analysis workload from the transaction workload. Data is turned into high-quality information to meet all enterprise reporting requirements for all levels of users.

What is ETL?
ETL stands for Extract-Transform-Load and it is a process of how data is loaded from the
source system to the data warehouse. Data is extracted from an OLTP database, transformed
to match the data warehouse schema and loaded into the data warehouse database. Many
data warehouses also incorporate data from non-OLTP systems such as text files, legacy
systems and spreadsheets.
Let's see how it works.
For example, consider a retail store that has different departments like sales, marketing, logistics etc. Each of them handles customer information independently, and the way each stores that data is quite different. The sales department stores it by the customer's name, while the marketing department stores it by customer ID.
Now if they want to check the history of a customer and want to know which products he or she bought as a result of different marketing campaigns, it would be very tedious.
The solution is to use a data warehouse to store information from different sources in a uniform structure using ETL. ETL can transform dissimilar data sets into a unified structure. BI tools can later be used to derive meaningful insights and reports from this data.
The following diagram gives you the ROAD MAP of the ETL process

1. Extract
- Extract relevant data.
2. Transform
- Transform the data to the DW (data warehouse) format.
- Build keys: a key is one or more data attributes that uniquely identify an entity. Various types of keys are primary keys, alternate keys, foreign keys, composite keys and surrogate keys. The data warehouse owns these keys and never allows any other entity to assign them.
- Cleanse the data: after the data is extracted, it moves into the next phase, cleaning and conforming the data. Cleaning omits erroneous data and also identifies and fixes errors. Conforming means resolving conflicts between data that are incompatible, so that they can be used in an enterprise data warehouse. In addition, this system creates metadata that is used to diagnose source-system problems and improve data quality.
3. Load
- Load the data into the DW (data warehouse).
- Build aggregates: creating an aggregate means summarizing and storing data that is available in the fact table, in order to improve the performance of end-user queries. (A small transform-and-load sketch follows.)
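As a minimal sketch of the transform and load steps, assuming hypothetical staging and warehouse tables, cleansed and conformed data can be moved with a single INSERT ... SELECT; the surrogate key is assumed to be auto-generated by the warehouse table itself:

    -- Load cleansed, conformed data from staging into the warehouse
    -- (stg_customers and dw_customers are hypothetical table names).
    INSERT INTO dw_customers (customer_id, customer_name)
    SELECT customer_id,
           UPPER(LTRIM(RTRIM(customer_name)))  -- cleanse: trim whitespace, conform case
    FROM stg_customers
    WHERE customer_id IS NOT NULL;             -- omit rows that fail a basic check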

ETL Testing Process
Similar to other testing processes, ETL testing also goes through different phases. The different phases of the ETL testing process are as follows.

ETL testing is performed in five stages
1. Identifying data sources and requirements
2. Data acquisition
3. Implement business logics and dimensional Modelling
4. Build and populate data
5. Build Reports

Types of ETL Testing

1. Production Validation Testing: Also called "table balancing" or "production reconciliation", this type of ETL testing is done on data as it is being moved into production systems. To support your business decisions, the data in your production systems has to be in the correct order. Informatica Data Validation Option provides the ETL testing automation and management capabilities to ensure that production systems are not compromised by the data.
2. Source to Target Testing (Validation Testing): This type of testing is carried out to validate whether the transformed data values are the expected data values.
3. Application Upgrades: This type of ETL testing can be automatically generated, saving substantial test development time. It checks whether the data extracted from an older application or repository is exactly the same as the data in the new repository or application.
4. Metadata Testing: Metadata testing includes testing of data type checks, data length checks and index/constraint checks.
5. Data Completeness Testing: This is done to verify that all the expected data is loaded into the target from the source. Some of the tests that can be run are comparing and validating counts, aggregates and actual data between the source and target for columns with simple or no transformation (see the sketch after this list).
6. Data Accuracy Testing: This testing is done to ensure that the data is accurately loaded and transformed as expected.
7. Data Transformation Testing: Testing data transformation is done because in many cases it cannot be achieved by writing one source SQL query and comparing the output with the target. Multiple SQL queries may need to be run for each row to verify the transformation rules.
8. Data Quality Testing: Data quality tests include syntax and reference tests. Data quality testing is done to avoid errors due to, for example, dates or order numbers during the business process.
Syntax tests report dirty data based on invalid characters, character patterns, incorrect upper or lower case, and so on.
Reference tests check the data against the data model; for example, the customer ID.
Data quality testing also includes number checks, date checks, precision checks, data checks, null checks etc.
9. Incremental ETL Testing: This testing is done to check the data integrity of old and new data when new data is added. Incremental testing verifies that inserts and updates are processed as expected during the incremental ETL process.
10. GUI/Navigation Testing: This testing is done to check the navigation or GUI aspects of the front-end reports.
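A minimal sketch of a data completeness test: compare record counts and a simple aggregate between the source and the target (src_orders and dw_orders are placeholder table names):

    -- Compare row counts between the source and the target table.
    SELECT (SELECT COUNT(*) FROM src_orders) AS source_count,
           (SELECT COUNT(*) FROM dw_orders)  AS target_count;

    -- Compare a simple aggregate for a column with no transformation.
    SELECT (SELECT SUM(order_amount) FROM src_orders) AS source_total,
           (SELECT SUM(order_amount) FROM dw_orders)  AS target_total;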
How to create ETL Test Case
ETL testing is a concept that can be applied to different tools and databases in the information management industry. The objective of ETL testing is to assure that data that has been loaded from a source to a destination after business transformation is accurate. It also involves the verification of data at the various middle stages that are used between source and destination.

While performing ETL testing, two documents that will always be used by an ETL tester are
1. ETL mapping sheets: An ETL mapping sheet contains all the information about the source and destination tables, including each and every column and its look-up in the reference tables. ETL testers need to be comfortable with SQL queries, as ETL testing may involve writing big queries with multiple joins to validate data at any stage of ETL. ETL mapping sheets provide significant help while writing queries for data verification (a hypothetical example follows below).
2. DB schema of source and target: It should be kept handy to verify any detail in the mapping sheets.
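As a hypothetical illustration of such a query, a look-up from the mapping sheet can be validated with a join against the reference table (all table and column names below are placeholders):

    -- Every country code loaded into the target must exist in the reference
    -- table; any row returned here is a failed look-up.
    SELECT t.customer_id, t.country_code
    FROM dw_customers t
    LEFT JOIN ref_countries r ON t.country_code = r.country_code
    WHERE r.country_code IS NULL;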

ETL Test Scenarios and Test Cases

1. Mapping doc validation: Verify the mapping doc to check whether the corresponding ETL information is provided. A change log should be maintained in every mapping doc.
2. Validation:
- Validate the source and target table structures against the corresponding mapping doc.
- The source data type and target data type should be the same.
- The lengths of data types in both source and target should be equal.
- Verify that data field types and formats are specified.
- The source data type length should not be less than the target data type length.
- Validate the names of the columns in the table against the mapping doc.
3. Constraint validation: Ensure the constraints are defined for each specific table as expected.
4. Data consistency issues:
- The data type and length for a particular attribute may vary in files or tables even though the semantic definition is the same.
- Misuse of integrity constraints.
5. Completeness issues:
- Ensure that all the expected data is loaded into the target table.
- Compare record counts between source and target.
- Check for any rejected records.
- Check that data is not truncated in the columns of the target tables.
- Check boundary value analysis.
- Compare the unique values of key fields between the data loaded into the warehouse and the source data.
6. Correctness issues:
- Data that is misspelled or inaccurately recorded.
- Null, non-unique or out-of-range data.
7. Transformation: Validate that the transformation rules are applied correctly.
8. Data quality:
- Number check: validate the numeric fields.
- Date check: dates have to follow the required format, and it should be the same across all records.
- Precision check.
- Data check.
- Null check.
9. Null validation: Verify the null values wherever "Not Null" is specified for a specific column.
10. Duplicate check:
- Validate that the unique key, the primary key and any other column that should be unique as per the business requirements have no duplicate rows.
- Check whether duplicate values exist in any column that is extracted from multiple columns in the source and combined into one column.
- As per the client requirements, ensure that there are no duplicates in combinations of multiple columns within the target.
11. Date validation: Date values are used in many areas of ETL development, for example
- to know the row creation date;
- to identify active records from the ETL development perspective;
- to identify active records from the business requirements perspective;
- sometimes the updates and inserts are generated based on the date values.
12. Complete data validation (see the sketch after this list):
- To validate the complete data set in the source and target tables, a minus query is the best solution.
- We need to run source-minus-target and target-minus-source.
- If the minus query returns any rows, those should be considered mismatched rows.
- Match the rows between source and target using an INTERSECT statement.
- The count returned by INTERSECT should match the individual counts of the source and target tables.
- If the minus query returns rows and the INTERSECT count is less than the source count or the target count, then duplicate rows exist.
13. Data cleanness: Unnecessary columns should be deleted before loading into the staging area.
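A minimal sketch of the complete data validation and duplicate checks above, using placeholder table names (EXCEPT is the SQL Server/PostgreSQL equivalent of Oracle's MINUS):

    -- Source minus target: rows in the source that are missing or changed in the target.
    SELECT customer_id, customer_name FROM src_customers
    EXCEPT
    SELECT customer_id, customer_name FROM dw_customers;

    -- Target minus source: rows in the target with no matching source row.
    SELECT customer_id, customer_name FROM dw_customers
    EXCEPT
    SELECT customer_id, customer_name FROM src_customers;

    -- Duplicate check: key values that occur more than once in the target.
    SELECT customer_id, COUNT(*) AS occurrences
    FROM dw_customers
    GROUP BY customer_id
    HAVING COUNT(*) > 1;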
Types of ETL Bugs

1. User interface bugs / cosmetic bugs: Related to the GUI of the application; font style, font size, colors, alignment, spelling mistakes, navigation and so on.
2. Boundary Value Analysis (BVA) related bugs: Minimum and maximum values.
3. Equivalence Class Partitioning (ECP) related bugs: Valid and invalid types.
4. Input/output bugs: Valid values not accepted; invalid values accepted.
5. Calculation bugs: Mathematical errors; the final output is wrong.
6. Load condition bugs: The system does not allow multiple users, or does not allow the customer-expected load.
7. Race condition bugs: The system crashes or hangs; the system cannot run on client platforms.
8. Version control bugs: No logo matching; no version information available; these usually occur in regression testing.
9. Hardware (H/W) bugs: A device does not respond to the application.
10. Help source bugs: Mistakes in the help documents.

Difference between Database testing and ETL testing

ETL testing:
- Verifies whether data is moved as expected.
- Verifies whether the counts in the source and target match.
- Verifies whether the transformed data is as per expectation.
- Verifies that the foreign-primary key relations are preserved during the ETL.
- Verifies for duplication in the loaded data.

Database testing:
- The primary goal is to check whether the data follows the rules/standards defined in the data model.
- Verifies that there are no orphan records and that the foreign-primary key relations are maintained.
- Verifies that there are no redundant tables and that the database is optimally normalized.
- Verifies whether data is missing in columns where it is required.

Responsibilities of an ETL tester
The key responsibilities of an ETL tester fall into three categories:
- Stage table / SFS or MFS
- Business transformation logic applied
- Target table loading from the stage file or table after applying a transformation
Some of the responsibilities of an ETL tester are:
- Test the ETL software.
- Test the components of the ETL data warehouse.
- Execute backend data-driven tests.
- Create, design and execute test cases, test plans and test harnesses.
- Identify problems and provide solutions for potential issues.
- Approve requirements and design specifications.
- Test data transfers and flat files.
- Write SQL queries for various scenarios, such as count tests.
ETL Performance Testing and Tuning
ETL performance testing is a confirmation test to ensure that an ETL system can handle the load of multiple users and transactions. The goal of performance tuning is to optimize session performance by eliminating performance bottlenecks. To tune or improve the performance of a session, you have to identify the performance bottlenecks and eliminate them. Performance bottlenecks can be found in the source and target databases, the mapping, the session and the system. One of the best tools used for performance testing is Informatica.
Automation of ETL Testing
The general methodology of ETL testing is to use SQL scripting or to "eyeball" the data. These approaches are time-consuming, error-prone and seldom provide complete test coverage. To accelerate testing, improve coverage, reduce costs and improve the defect detection ratio of ETL testing in production and development environments, automation is the need of the hour. One such tool is Informatica.
Best Practices for ETL Testing
1. Make sure the data is transformed correctly.
2. The projected data should be loaded into the data warehouse without any data loss or truncation.
3. Ensure that the ETL application appropriately rejects invalid data, replaces it with default values and reports it.
4. Ensure that the data loads into the data warehouse within the prescribed and expected time frames, to confirm scalability and performance.
5. All methods should have appropriate unit tests, regardless of visibility.
6. All unit tests should use appropriate coverage techniques to measure their effectiveness.
7. Strive for one assertion per test case.
8. Create unit tests that target exceptions.


