
30/03/2018 Basics of Flat File (CSV, Delimited) testing


FLAT FILE TESTING


Testing of delimited files such as CSV files or fixed width flat files.

What are Flat Files?


Flat files are extensively used for exchanging data between enterprises or between organizations within an
enterprise. Flat files come in two forms: delimited files such as CSV (comma separated) files, and fixed width
files.

What is Flat File Testing?


Flat File testing is the process of validating the quality of data in the flat file as well as ensuring that the data
in the flat file has been consumed appropriately by the application or ETL process.

Challenges in Flat File Testing?


Testing of inbound flat files presents unique challenges because the producer of the flat file is usually
a different organization within the enterprise or an external vendor. Consequently, there might be differences
in the format and content of the files, since there is no easy way to enforce data type and data quality
constraints on the data in the flat files. Issues in flat file data can cause failures in the consuming process.
While the file processing requirements differ from project to project, the focus of this use case is to list
some of the common checks that need to be performed for validating flat files.

 
Flat File Testing Categories
 File Ingestion Testing

 Data Type Testing

 Data Quality Testing

 Data Completeness Testing

 Data Transformation Testing

 Performance Testing

FLAT FILE INGESTION TESTING

http://www.datagaps.com/concepts/flat-file-testing 1/3

When data is moved using flat files between enterprises or organizations within an enterprise, it is important to perform a set of file ingestion
validations on the inbound flat files before consuming the data in those files.

File name validation

Files are ftp'ed or copied over to a specific folder for processing. These files usually have a specific naming convention so that the process consuming the file is able
to understand the contents and date. From a testing standpoint, the file name pattern needs to be validated to verify that it meets the requirement.

Example: A government agency gets files from multiple vendors on a periodic basis. The arriving files should follow a naming convention of
'CompanyCode_ContentType_DateTimestamp.csv'. However, the files coming in from a specific vendor do not have the correct company name.
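A file name check of this kind could be sketched as below. This is only an illustration: the timestamp layout (YYYYMMDDHHMMSS), the alphanumeric-only name parts, and the function and variable names are assumptions, not part of the convention stated above.

```python
import re
from datetime import datetime

# Assumed shape of CompanyCode_ContentType_DateTimestamp.csv:
# alphanumeric code and content type, 14-digit timestamp (hypothetical).
FILE_NAME_PATTERN = re.compile(
    r"(?P<company>[A-Za-z0-9]+)_(?P<content>[A-Za-z0-9]+)_(?P<stamp>\d{14})\.csv"
)


def validate_file_name(name, allowed_companies):
    """Return a list of problems found in the file name (empty if valid)."""
    match = FILE_NAME_PATTERN.fullmatch(name)
    if match is None:
        return ["name does not match CompanyCode_ContentType_DateTimestamp.csv"]
    problems = []
    # The company code must be one the consumer knows about.
    if match.group("company") not in allowed_companies:
        problems.append(f"unknown company code: {match.group('company')}")
    # The 14 digits must also form a real date and time.
    try:
        datetime.strptime(match.group("stamp"), "%Y%m%d%H%M%S")
    except ValueError:
        problems.append(f"invalid timestamp: {match.group('stamp')}")
    return problems
```

Returning a list of problems rather than a single boolean lets a test report every violation in one pass, for example both a wrong company code and a malformed timestamp.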

Size and Format of the flat files

Although flat files are generally delimited or fixed width, it is common to have a header and footer in these files. Sometimes, these headers have a row count that
can be used to verify that the file contains the entire data as expected.

Some of the relevant checks are:


   Verify that the size of the file is within the expected range where applicable.
   Verify that the header, footer and column heading rows have the expected format and have the expected location within the flat file.
   Perform any row count checks to cross check the data in the header with the values in the delimited data.
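The checks listed above might look roughly like the following sketch. The file layout is hypothetical: a first line of the form 'H,<row_count>', a column heading row on the second line, and a last line 'T' as the footer. Real files will differ, so the record markers and the function name are assumptions.

```python
import csv
import io


def check_layout(text, expected_columns, min_bytes, max_bytes):
    """Check size, header/footer placement and row count of the assumed layout."""
    problems = []
    # Size check: encoded size must fall inside the expected range.
    size = len(text.encode("utf-8"))
    if not (min_bytes <= size <= max_bytes):
        problems.append(f"size {size} outside [{min_bytes}, {max_bytes}]")
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 3:
        return problems + ["file too short to hold header, headings and footer"]
    header, headings, data, footer = rows[0], rows[1], rows[2:-1], rows[-1]
    # Header format check, then cross check its row count with the data rows.
    if len(header) != 2 or header[0] != "H" or not header[1].isdecimal():
        problems.append("header record malformed")
    elif int(header[1]) != len(data):
        problems.append(f"header says {header[1]} rows, found {len(data)}")
    if headings != expected_columns:
        problems.append(f"unexpected column headings: {headings}")
    if footer != ["T"]:
        problems.append("footer record missing or misplaced")
    return problems
```

For large files the same checks would be better done while streaming the file instead of loading all rows into memory; the in-memory version is kept here for brevity.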

Example: A financial reporting company generates files with a header that contains the summary amount with the line items having the detailed split. The sum of the
amounts in the line items should match the summary amount in the header.
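A reconciliation of this kind can be sketched as follows, assuming a hypothetical layout in which the first line is 'H,<summary_amount>' and every following line is 'id,amount' with no column heading row. The function name is illustrative only.

```python
import csv
import io
from decimal import Decimal


def amounts_reconcile(text):
    """Compare the header summary amount with the sum of the line item amounts.

    Returns (matches, summary, detail_total).
    """
    rows = list(csv.reader(io.StringIO(text)))
    summary = Decimal(rows[0][1])
    # Decimal keeps monetary values exact; binary floats could produce
    # spurious mismatches from rounding.
    detail_total = sum((Decimal(row[1]) for row in rows[1:]), Decimal("0"))
    return summary == detail_total, summary, detail_total
```

Returning both totals alongside the boolean makes it easy to report the size of the discrepancy when the check fails.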

File arrival, processing and deletion times

Files arrive periodically into a specific network folder or an ftp location before getting consumed by a process. Usually, there are specific requirements that need to
be met regarding the file arrival time, the order of arrival and file retention.

Example: A pharma company gets a set of files from a vendor on a daily basis. The process consuming these files expects the complete set of files to be available before
processing.
1. A file that was supposed to come yesterday was delayed. It came in sometime after today's file arrived, causing issues due to the difference in the order of processing the files.
2. After a file gets processed, it is supposed to be moved to a specific directory where it is to be retained for a specified period of time and then deleted. However, the file did
not get copied over.
3. Compare the transformed data in the target table with the expected values for the test data.
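The out-of-order arrival described in the first item could be detected with a check like the sketch below. It assumes each file has a business date (for instance parsed from its name) and a recorded arrival time; the tuple layout and the function name are hypothetical.

```python
from datetime import date, datetime


def out_of_order_arrivals(arrivals):
    """Find files that arrived after a file with a later business date.

    arrivals: list of (business_date, arrival_time, name) tuples.
    """
    late = []
    latest_seen = None
    # Walk the files in the order they actually arrived.
    for business_date, _arrived, name in sorted(arrivals, key=lambda a: a[1]):
        if latest_seen is not None and business_date < latest_seen:
            # A newer business date was already delivered before this file.
            late.append(name)
        else:
            latest_seen = business_date
    return late


# Usage: yesterday's file shows up three hours after today's file.
arrivals = [
    (date(2018, 3, 29), datetime(2018, 3, 30, 9, 0), "f_20180329.csv"),
    (date(2018, 3, 30), datetime(2018, 3, 30, 6, 0), "f_20180330.csv"),
]
late_files = out_of_order_arrivals(arrivals)
```

A test could run this before the consuming process starts and hold processing whenever the returned list is not empty.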

Automate file ingestion testing using ETL Validator


ETL Validator comes with Component Test Case and File Watcher which can be used to test Flat Files.
   Flat File Component: Flat file component is part of the Component Test Case. It can be used to define data type and data quality rules on the incoming flat
file. The data in the flat file can also be compared with data from the database.
   File Watcher: Using File Watcher, test plans can be triggered automatically when a new file comes into a directory so that the test cases on the file can be
executed automatically before the files are used further by the consuming process.
   SFTP Connection: Makes it easy to compare and validate flat files located in a remote SFTP location.


Copyright © 2016 datagaps inc. All rights reserved.  
