You are on page 1of 5

With the evolution and rapid adoption of the cloud technologies many enterprises and developers

are moving towards cloud based solution offerings. This is also true for many existing On-Premise
Data Warehouses and moving them to cloud based Data Warehouses.

Now this paradigm shift brings new challenges as well. There are challenges related to tools’
choice and some are related to approach to achieve the objective. Considering technology
advancements there are many tools available in the market offering similar Data Warehousing
capabilities as those available On-Premises.

Depending on the size of the data, transformation requirements and information accessibility,
architects and developers can fairly choose the right tool satisfying respective needs. Challenges
with the tools are relatively easy to overcome but it is more difficult and important to decide on
the approach to be considered for this Shift; and one of the trivial factor to consider is whether to
stick to traditional, proven ETL (Extract, Transform, and Load) approach or to shift to ELT
(Extract, Load, Transform)?

These challenges are also applicable to non-cloud solutions.

Before deep dive into ETL vs ELT here is brief on what is Extract, Transform and Load -

Extract

This is a process where data is collected from various homogeneous or heterogeneous data
sources. This process integrates data available in multiple formats from various applications,
databases, files etc. developed and maintained by various applications developers / vendors and
available on different locations and hardware.

Transform

These are more in terms of rules and functions which needs to be applied to massage the data and
make it ready for the targeted Database / Data Warehouse.

Load

Once the data is transformed as per operational needs, it is loaded into the end target from where
it is consumed by the end users either using operational systems or business intelligence tools.

ETL vs ELT

Having above understanding in place, it is fairly easy now to understand the difference in ETL
and ELT.

During the ETL approach the data is extracted, then transformed and ultimately loaded into the
target Database / Data Warehouse.

Where as in ELT approach, the data is extracted as is and loaded into the target Database / Data
Warehouse and then necessary transformations takes place preparing the information for
downstream uses.
Features to consider while choosing your approach – ETL or ELT

There are many factors which needs to be considered and well thought while choosing the
approach to fit the transformation needs. On a very high level below are some primary factors -

1. Data Volume – What is the amount of the data which is required to be moved from On-
Premise Data Ware house to Cloud based Data Warehouse?

2. Frequency of Data Load – Along with data volume, is it important to identify the frequency
of data load from source to destination. Is it near real time, stale data or onetime load? Most of
the transactional systems are putting new data or modifying the existing data leading towards
the need for change data capture. This adds complexity in identifying the delta along with slowly
moving dimensions

3. Sources and Destinations– Which are various data source to be connected to fetch he data
and which are the targeted databases / data stores which will accommodate the data as new store.
There are some databases which are extremely good at ELT approach where as some are relative
better during ETL. Some systems have native support to the source data where as some needs
additional connectors adding cost to the solution.
4. Availability of Tools – What are the various tools available / to be considered to transform
the data from On-Premise Data Warehouse to Cloud based Data Warehouse

5. Transformation responsibility – There is a need to give significant consideration for when


the transformations need to be accomplished and who is responsible to perform these
transformations. Whether these needs to be taken care by the tool used during the ETL process
or the transformations needs to be performed by the targeted Database / Data Warehouse once
the data is loaded there.

6. Flexibility to adopt new business rules – Traditional ETL processes do not consider the
business processes. They operate more on data movement from source to destination. As
businesses grow, their operating processes undergo transformations and that leads to change in
the ETL as well. Currently, ETL architectures are not that flexible to accommodate such
changes.

7. Scalability – Considering the dramatic growth in the data volume, it is increasingly difficult
for the IT systems to scale up quickly to accommodate data storage and processing needs. While
selecting the approach, this is important to consider the future data growth and available
technology landscape

8. Data Security – This is one of the very important aspect to consider during the approach
selection. Some of the data you literally do not want to get exposed to outside world and will
have reservations on where and how it should be stored

9. License fees – With the modern RDBMS systems available in the market, which are capable
of taking transformation responsibility, a need to have a separate ETL tool is no longer needed.

10. Available knowledge base and training needs – With the advancement in the technology
there is a great need to have through understanding and implementation expertise of the modern
RDBMS capable of taking care of ETL / ELT needs. Availability of such talent also participate
in the decision making.

With the advancements in the modern tools and technologies, availability of data and necessity to
make the data available for end users at near real time, many architects are experimenting hybrid
approaches like TELT, ETLT or even TETLT. These hybrid approaches help bridging gaps in
ETL or ELT but they also add additional burden on the RDBMS with added complexity.