
Teradata to Snowflake

Migration Guide
About Bigdata Dimension Labs
Bigdata Dimension Labs was built on the idea that every business should be
positioned to make more profitable, data-driven decisions. To achieve this, they
have partnered with the best modern data sharing technologies in the world and
are recognized as centers of excellence for:

Bigdata Dimension Labs has empowered organizations such as 3M, Agile and
the Bill Gates Foundation with a 360° performance overview, real-time analytics
for better decision-making, standardized reporting, and faster data access and
processing.

Imagine what they can do for you.

Find out more at BDDLabs.com

Teradata to Snowflake Migration Guide 2


Content
4. Why migrate?

6. Strategy—thinking about your migration

8. Migrating your existing Teradata warehouse

12. Need help migrating?

13. Appendix A—Your Ultimate Migration Partner

14. Appendix B—Migration preparation checklist

15. Appendix C—Data type conversion table

16. Appendix D—SQL considerations



Why migrate?
Decades ago, Teradata identified the need to manage and analyze large volumes
of data. But just as the volume, velocity and variety of data has since changed,
the cloud has enabled what's possible today with modern data analytics. For
example, by separating compute from storage, Snowflake has developed a modern
cloud data warehouse solution that automatically and instantly scales in a way
not possible with Teradata, whether the current Teradata system is on-premises
or hosted in the cloud. Snowflake accomplishes this with its multi-cluster,
shared data architecture.

YOUR MOTIVATION TO MIGRATE
Some of the key reasons enterprises migrate off of Teradata:

1. Legacy platform
Traditional technology fails to meet the needs of today's business users, such
as the increasing requirement for unlimited concurrency and performance.

2. Cloud
A strategy to move from on-premises to cloud implies a move away from
traditional IT delivery models to a more 'on-demand,' 'as-a-service' model with
minimal management intervention.

3. New data sources and workloads
Key data sources for the modern enterprise are already in the cloud. The cloud
also allows for new types of analytics to be assessed and refined without a
long-term commitment to infrastructure or specific tools.

4. Cost
Snowflake allows true pay-as-you-go cloud scalability without the need for
complex reconfiguration as your data or workloads grow.



WHY SNOWFLAKE?
Snowflake's innovations break down the technology and architecture barriers
that organizations still experience with other data warehouse vendors. Only
Snowflake has achieved all six of the defining qualities of a data warehouse
built for the cloud:

ZERO MANAGEMENT
Snowflake reduces complexity with built-in performance, so there's no
infrastructure to tweak, no knobs to turn and no tuning required.

ALL OF YOUR DATA
Create a single source of truth to easily store, integrate and extract critical
insight from petabytes of structured and semi-structured data (JSON, XML, AVRO).

ALL OF YOUR USERS
Provide access to an architecturally unlimited number of concurrent users and
applications without eroding performance.

PAY BY THE SECOND
Snowflake's built-for-the-cloud solution scales storage separate from compute,
up and down, transparently and automatically.

DATA SHARING
Snowflake extends the data warehouse to the Data Sharehouse™, with direct,
governed and secure data sharing within seconds, so enterprises can easily
forge one-to-one, one-to-many and many-to-many data sharing relationships.

COMPLETE SQL DATABASE
Snowflake supports the tools millions of business users already know how to
use today.

THE CORE OF SNOWFLAKE


Snowflake delivers the performance, concurrency and simplicity needed to store
and analyze all data available to an organization in one location. Snowflake’s
technology combines the power of data warehousing, the flexibility of big data
platforms, the elasticity of the cloud and live data sharing at a fraction of the cost of
traditional solutions.



Strategy—thinking about
your migration
WHAT SHOULD YOU CONSIDER?
There are several things to contemplate when choosing your migration path.
It's usually desirable to pilot the migration on a subset of the data and
processes. Organizations often prefer to migrate in stages, reducing risk and
showing value sooner. However, you must balance this against the need to keep
the program momentum and minimize the period of dual-running. In addition, your
approach may also be constrained by the interrelationships within the data,
such as data marts that rely on references to data populated via a separate
process in another schema.

Questions to ask about your workloads and data

• What workloads and processes can be migrated with minimal effort?

• Which processes have issues today and would benefit from re-engineering?

• What workloads are outdated and require a complete overhaul?

• What new workloads would you like to add that would deploy easier in Snowflake?

'Lift and shift' vs a staged approach
The decision whether to move data and processes in one bulk operation or deploy
a staged approach depends on several factors. They include the nature of your
current data analytics platform, the types and number of data sources, and your
future ambitions and timescales.

Factors that lead to a lift-and-shift approach may focus on what you already have:

• You have highly integrated data across the existing warehouse.

• Or, you are migrating a single independent, standalone data mart.

• Or, your current system uses well-designed data and processes using standard
ANSI SQL.

• Or, you have timescale pressures to move off legacy equipment.

Factors that may lead to a staged approach:

• Your warehouse platform consists of many independent data marts and other
data applications, which can be moved independently over time.

• Or, you have critical data and processes within your data warehouse that no
longer perform well and require re-engineering.

• Or, you've chosen to focus on new business requirements rather than reworking
legacy processes.

• Or, you want to change your data ecosystem, such as adding a new ELT or other
data ingestion tool, or move to new business intelligence (BI) and
visualization tools.



WHAT YOU DON’T NEED TO
WORRY ABOUT
When migrating to Snowflake from Teradata there are a
number of concerns that you can forget about as they are
not relevant to your Snowflake environment.

Data distribution and primary indexes
In Snowflake, there is no need for primary indexes. Since compute is separate
from storage in Snowflake's architecture, the data is not pre-distributed to
the MPP compute nodes. In Snowflake, we have MPP compute nodes that do not rely
on the data being distributed ahead of time.

Since Snowflake's data is not pre-distributed, it can scale to more parallel
compute nodes instantly. With Teradata, the data would have to run a reconfig,
with new AMPs/nodes added and new hashmaps created before the data from the
physical table could be re-distributed to the new AMPs. This is a process that
requires significant planning and resources, which impacts performance but is
not necessary with Snowflake.

Indexing and query optimization
Snowflake has an optimizer built from the ground up and architected for MPP
and the cloud. Snowflake understands parallel execution plans and automatically
optimizes them, relieving our customers of this task. This means there are no
indexes in Snowflake, so you don't have to migrate your USIs or NUSIs.

Workload management
Workload management is unnecessary in a Snowflake environment due to its
multi-cluster architecture, which allows you to create separate virtual
warehouses for your disparate workloads so as to avoid resource contention
completely.

Statistics collection
Snowflake automatically captures statistics, relieving DBAs from having to set
up jobs to collect statistics for performance tuning. It's automatic in
Snowflake, so you no longer have to remember to add new tables to the process
when your data warehouse grows.

Capacity planning
With Snowflake, you pay for only what you use. Snowflake is a SaaS product but
further enhanced for efficiency with per-second, usage-based pricing. Under
this model, Snowflake also offers the option for further cost reductions for
customers who want to pre-purchase usage. On the flip side, with capacity
planning for an on-premises Teradata system, you run a risk of over- or
under-configuring your system. Even with Teradata IntelliCloud, you have a
similar capacity planning risk, as compute and storage are fixed per instance.
If you need more capacity, you must buy in predefined increments. With
Snowflake's elastic storage and compute architecture, you never have this risk,
so you can save money and avoid the time previously spent on extensive planning.

Disaster recovery
Teradata has several disaster recovery scenarios. Many of them require the
purchase of another system, plus software such as Unity to implement these
scenarios. With Snowflake, none of this is necessary. Snowflake leverages many
of the built-in features of the cloud, such as the automatic replication of
data built into AWS. Snowflake is automatically synced across multiple AWS
availability zones, with more zones and other cloud computing platforms
becoming available on a consistent basis. There is no work on your part to
establish this.

Separate dev/test environment
With Teradata, to perform development and testing, you need additional servers,
which means an additional cash outlay just for the hardware, plus the time to
configure it. Not so with Snowflake, as you can simply create another database
in your account and set it up for any purpose you need, such as dev or test.
In addition, with Snowflake's zero-copy clone feature, you can instantly
populate those databases with complete copies of production data for no
additional cost. With Teradata, you have to endure the painstaking process of
exporting your production data from one system to import it to your dev or
test server.
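As a concrete illustration, a dev environment can be cloned from production in a single statement using Snowflake's zero-copy clone feature (a sketch; the database names are hypothetical):

```sql
-- Create a full dev copy of production instantly; no data is
-- physically duplicated until either copy is modified.
CREATE DATABASE dev_db CLONE prod_db;
```

The clone shares the underlying storage with production, so it is both immediate and free of additional storage cost until changes are made.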



Migrating your existing
Teradata warehouse
In order to successfully migrate your enterprise data warehouse to Snowflake,
you need to develop and follow a logical plan that includes the items in this
section.

MOVING YOUR DATA MODEL
As a starting point for your migration, you'll need to move your database
objects from Teradata to Snowflake. This includes the databases, tables, views
and sequences in your existing data warehouse that you want to move over to
your new Snowflake data warehouse. In addition, you may want to include all of
your user account names, roles and object grants. At a minimum, the user who
owns the Teradata database must be created on the target Snowflake system
before migrating data.

Which objects you decide to move will be highly dependent on the scope of your
initial migration. There are several options for making this happen. The
following sections outline three possible approaches for moving your data model
from Teradata to Snowflake.

Using a data modeling tool
If you have stored your data warehouse design in a data modeling tool, you can
generate the DDL needed to rebuild these objects. Since Snowflake uses standard
SQL, you simply need to pick ANSI SQL as the output scripting dialect rather
than Teradata. Keep in mind, Snowflake is self-tuning and has a unique
architecture. You won't need to generate code for any indexes, partitions or
storage clauses of any kind that you may have needed in Teradata. You only need
basic DDL, such as CREATE TABLE, CREATE VIEW and CREATE SEQUENCE. Once you have
these scripts, you can log into your Snowflake account to execute them.

If you have a data modeling tool, but the model is not current, we recommend
you reverse engineer the current design into your tool, then follow the
approach outlined above.

Using existing DDL scripts
You can begin with your existing DDL scripts if you don't have a data modeling
tool. But you'll need the most recent version of the DDL scripts (in a version
control system). You'll also want to edit these scripts to remove code for
extraneous features and options not needed in Snowflake, such as primary
indexes and other storage or distribution related clauses. Depending on the
data types you used in Teradata, you may also need to do a search-and-replace
in the scripts to change some of the data types to Snowflake optimized types.
For a list of these data types, please see Appendix C.
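As an illustration, trimming a Teradata CREATE TABLE for Snowflake might look like this (a hypothetical table; the type mappings follow Appendix C):

```sql
-- Teradata original
CREATE TABLE sales (
  sale_id   INTEGER,
  sale_desc LONG VARCHAR,
  sale_dt   DATE
) PRIMARY INDEX (sale_id);

-- Snowflake equivalent: drop the PRIMARY INDEX clause,
-- and LONG VARCHAR becomes VARCHAR
CREATE TABLE sales (
  sale_id   INTEGER,
  sale_desc VARCHAR,
  sale_dt   DATE
);
```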



Creating new DDL scripts
If you don't have current DDL scripts for your data warehouse, or a data
modeling tool, you will need to extract the metadata needed from the Teradata
data dictionary in order to generate these scripts. But for Snowflake, this
task is simpler since you won't need to extract metadata for indexes and
storage clauses.

As mentioned above, depending on the data types in your Teradata design, you
may also need to change some of the data types to Snowflake optimized types.
You will likely need to write a SQL extract script of some sort to build the
DDL scripts. Rather than do a search and replace after the script is generated,
you can code these data type conversions directly into the metadata extract
script. The benefit is that you have automated the extract process, so you can
do the move iteratively. Plus, you will save time editing the script after the
fact. Additionally, coding the conversions into the script is less error-prone
than any manual cleanup process, especially if you are migrating hundreds or
even thousands of tables.

MOVING YOUR EXISTING DATA SET

Once you have built your objects in Snowflake, you'll want to move the
historical data already loaded in your Teradata system over to Snowflake. To do
this, you can use a third-party migration tool (see Appendix A), an ETL
(extract, transform, load) tool or a manual process to move the historical
data. Choosing among these options, you should consider how much data you have
to move. For example, to move 10s or 100s of terabytes, or even a few petabytes
of data, a practical approach may be to extract the data to files and move it
via a service such as AWS Snowball. If you have to move 100s of petabytes or
even exabytes of data, AWS Snowmobile is likely the more appropriate option.

1. If you choose to move your data manually, you will need to extract the data
for each table to one or more delimited flat files in text format using
Teradata Parallel Transporter (TPT).

2. Then upload these files using the PUT command into an Amazon S3 staging
bucket, either internal or external. We recommend these files be between 100MB
and 1GB to take advantage of Snowflake's parallel bulk loading.

3. After you have extracted the data and moved it to S3, you can begin loading
the data into your table in Snowflake using the COPY command. You can check out
more details about our COPY command in our online documentation.
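Steps 2 and 3 above might look like this in practice (a sketch; the stage, file and table names are illustrative):

```sql
-- Step 2: upload TPT-extracted flat files to a stage
PUT file:///exports/orders_*.csv @my_stage;

-- Step 3: bulk load the staged files into the target table
COPY INTO orders
  FROM @my_stage
  FILE_FORMAT = (TYPE = CSV FIELD_DELIMITER = '|');
```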



MIGRATING YOUR QUERIES AND WORKLOADS

Data query migration
Since Snowflake uses ANSI-compliant SQL, most of your existing queries will
execute on Snowflake without requiring change. However, Teradata does use some
Teradata-specific extensions, so there are a few constructs to watch out for.
Some examples include the use of LIKE ANY and QUALIFY. See Appendix D for
details and suggested translations.

Migrating BI tools
Many of your queries and reports are likely to use an existing business
intelligence (BI) tool. Therefore, you'll need to account for migrating those
connections from Teradata to Snowflake. You'll also have to test those queries
and reports to be sure you're getting the expected results.

This should not be too difficult since Snowflake supports standard ODBC and
JDBC connectivity, which most modern BI tools use. Many of the mainstream tools
have native connectors to Snowflake. Don't worry if your tool of choice is not
available. You should be able to establish a connection using either ODBC or
JDBC. If you have questions about a specific tool, your Snowflake contact will
be happy to help.

Handling workload management
As stated earlier, the workload management required in Teradata is unnecessary
with Snowflake. The multi-cluster architecture of Snowflake allows you to
create separate virtual warehouses for your disparate workloads to avoid
resource contention completely. Your workload management settings in Teradata
(TASM or TIWM) will give you a good idea of how you'll want to set up Snowflake
virtual warehouses. However, you'll need to consider the optimal way to
distribute these in Snowflake. As a starting point, create a separate virtual
warehouse for each workload. You will need to size the virtual warehouse
according to the resources required to meet the SLA for that workload. To do
so, consider the following:

• Is there a specific time period in which this workload needs to complete?
Between certain hours? You can easily schedule any Snowflake virtual warehouse
to turn on and off, or just auto-suspend and automatically resume when needed.

• How much compute will you need to meet that window? Use that to determine the
appropriately sized virtual warehouse.

• How many concurrent connections will this workload need? If you normally
experience bottlenecks, you may want to use the Snowflake multi-cluster
warehouse for those use cases to allow automatic scale-out during peak
workloads.

• Think about dedicating at least one large virtual warehouse for tactical,
high-SLA workloads.

• If you discover a new workload, you can easily add it on demand with
Snowflake's ability to instantly provision a new virtual warehouse.
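Translating a Teradata workload definition might start with warehouse definitions along these lines (a sketch; sizes, suspend intervals and names are assumptions to adapt, and multi-cluster warehouses require the appropriate Snowflake edition):

```sql
-- Dedicated warehouse for a nightly ETL workload
CREATE WAREHOUSE etl_wh WITH
  WAREHOUSE_SIZE = 'LARGE'
  AUTO_SUSPEND = 300      -- suspend after 5 idle minutes
  AUTO_RESUME = TRUE;

-- Multi-cluster warehouse for a high-concurrency BI workload
CREATE WAREHOUSE bi_wh WITH
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3   -- scale out automatically at peak
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```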



MOVING THE DATA PIPELINE AND ETL PROCESSES
Snowflake is optimized for an ELT (extract, load, transform) approach. However,
we support a number of traditional ETL (extract, transform, load) and data
integration solutions. We recommend a basic migration of all existing data
pipelines and ETL processes in order to minimize the impact to your project,
unless you are planning to significantly enhance or modify them. Given that
testing and data validation are key elements of any changes to the data
pipeline, maintaining these processes as-is will reduce the need for extensive
validation.

Snowflake has worked diligently to ensure that the migration of processes
running on traditional ETL platforms is as painless as possible. Native
connectors for tools such as Talend and Informatica make the process quick
and easy.

We recommend running the data pipeline in both Snowflake and Teradata during
the initial migration. This way, you can simplify the validation process by
enabling a quick comparison of the results from the two systems. Once you're
sure queries running against Snowflake are producing identical results to
queries from Teradata, you can be confident that the migration did not affect
data quality. But you should see a dramatic improvement in the performance.

CUT OVER
Once you migrate your data model, your data, your loads and your reporting over
to Snowflake, you must plan your switch from Teradata to Snowflake. Here are
the fundamental steps:

1. Execute a historic, one-time load to move all the existing data.

2. Set up ongoing, incremental loads to collect new data.

3. Communicate the cut-over to all Teradata users, so they know what's changing
and what they should expect.

4. Ensure all development code is checked in and backed up, which is a good
development practice.

5. Point production BI reports to pull data from Snowflake.

6. Run Snowflake and Teradata in parallel for a few days and perform
verifications.

7. Turn off the data pipeline and access to Teradata for the affected users
and BI tools.

For data pipelines that require re-engineering, you can leverage Snowflake's
scalable compute and bulk-loading capabilities to modernize your processes and
increase efficiency. You may consider taking advantage of our Snowpipe tool for
loading data continuously as it arrives in your cloud storage provider of
choice, without any resource contention or impact to performance, thanks to
Snowflake's cloud-built architecture. Snowflake makes it easy to bring in large
datasets and perform transformations at any scale.
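A continuous-load pipeline with Snowpipe can be defined in a single statement (a sketch; it assumes an external stage with cloud event notifications already configured, and the pipe, stage and table names are hypothetical):

```sql
-- Auto-ingest new files as they land in the external stage
CREATE PIPE orders_pipe AUTO_INGEST = TRUE AS
  COPY INTO orders
  FROM @orders_stage
  FILE_FORMAT = (TYPE = CSV);
```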



Need help migrating?
Bddlabs experts are available to accelerate your migration, structure and optimize your planning and implementation activities,
and apply customer best practices to meet your technology and business objectives. The Bddlabs Engagement, Delivery and
Advisory Services Team deploys a powerful combination of data architecture expertise and advanced technical knowledge of
the platform to deliver high-performing data strategies, proofs of concept and migration projects.

We have also built an ingestion tool to automate and accelerate the migration process.
Whether your organization is fully staffed for a platform migration or you need additional expertise, Bddlabs has the skills
and tools to accelerate your journey to cloud-built data analytics, so you can reap the full benefits of Snowflake quickly. To
find out more, please contact the Bddlabs sales team using the details below.

If you want to learn more, we encourage you to visit BDDLabs.com for more information
about modern cloud data sharing. You can also access documentation, view webinars, browse our
offers, see upcoming events, and get support. We also invite you to schedule a free
demo of our data sharehousing technology so your business can get started right away!

Email: solutions@bddlabs.com
Phone: 888-856-2238



Appendix A—Your Ultimate
Migration Partner
Data is the fuel of successful business. The days of costly hardware and outdated infrastructure are over: the cloud has
transformed the way we use, manage and store data. Cloud computing empowers enterprises with scalable, cost-effective and
innovative solutions. To fully utilize the power of the cloud, you need frameworks and solutions designed for the modern
computing landscape, built to handle modernization challenges and uncover insights.

At Bigdata Dimension Labs, we make those frameworks and solutions. Our cloud-powered frameworks and solutions enable
enterprises to become data-driven, increase performance and provide a path to automatic modernization. Our solutions also
monetize the data you share externally (DaaS) and make the best use of the data others share with you - all with our flexible
and affordable cloud solutions.

Our offerings include the following platforms and solutions:

• The ultimate Snowflake cloud data warehouse companion, which uncovers the insights of the Snowflake platform on Azure
and AWS.

• An application modernization platform that is out-of-the-box cloud-native and ready for Azure, AWS or Google, built with
an ageless application modernization framework for the cloud so that you never have to code it again.

• A modern Talend ingestion and code standards platform to help you migrate from Teradata or Netezza, with a common API
ingestion engine.

• A modern Talend job scan platform that validates code standards and accelerates deployment.

• A modern self-service data sharing platform for providers and consumers that eliminates the need for API, FTP, email, ETL
and manual processes to share data.



Appendix B—Migration preparation
checklist
DOCUMENT THE EXISTING SOLUTION
• List of databases that need to be migrated
• List of database objects that need to be migrated
• List of processes and tools that populate and pull data from the existing solution
• List of security roles, users and permissions
• List of Snowflake accounts that exist or need to be created
• Frequency of security provisioning processes
• Document the existing solution into an as-is architecture diagram

DETERMINE A MIGRATION APPROACH
• List of processes to migrate as is
• List of processes that need re-engineering
• List of processes that need to be fixed
• Draft of migration deliverables
• To-be architecture diagram

CAPTURE THE DEVELOPMENT AND DEPLOYMENT PROCESSES
• List of tools that will be introduced with the migration
• List of tools that will be deprecated after the migration
• List of development environments needed for the migration
• List of deployment processes used for the migration

PRIORITIZE DATASETS FOR MIGRATION
• List of data sets to migrate first
• Method for identifying process dependencies for data sets
• Documentation of process dependencies for data sets

IDENTIFY THE MIGRATION TEAM
• List of migration team members and roles
• Contact information for all team members

DEFINE THE MIGRATION DEADLINES AND BUDGET
• List of business expectations for the migration deadline
• Documented budget allocated for the migration project
• Completed estimation template for Snowflake virtual warehouses

DETERMINE THE MIGRATION OUTCOMES
• List of high-level desired outcomes and assumptions at the completion of the migration
• Documented plan for communicating the migration project wins to stakeholders



Appendix C—Data type conversion table
This appendix contains a sample of some of the data type mappings you need to
know when moving from Teradata to Snowflake. As you will see, many are the
same, but a few will need to be changed.

TERADATA DATA TYPE   NOTES       SNOWFLAKE DATA TYPE   NOTES

BYTEINT                          BYTEINT
SMALLINT                         SMALLINT
INTEGER                          INTEGER
BIGINT                           BIGINT
DECIMAL                          DECIMAL
FLOAT                            FLOAT
NUMERIC                          NUMERIC
CHAR                 Up to 64K   CHAR                  Up to 16MB
VARCHAR              Up to 64K   VARCHAR               Up to 16MB
LONG VARCHAR         Up to 64K   VARCHAR               Up to 16MB
CHAR VARYING(n)                  CHAR VARYING(n)
REAL                             REAL
DATE                             DATE
TIME                             TIME
TIMESTAMP                        TIMESTAMP
BLOB                             BINARY                Up to 8MB
CLOB                             VARCHAR               Up to 16MB
BYTE                             BINARY
VARBYTE                          VARBINARY
GRAPHIC                          VARBINARY
JSON                             VARIANT
ARRAY                            ARRAY



Appendix D—SQL considerations
Below are examples of some changes you may need to make to
your Teradata SQL queries so they will run correctly in Snowflake.
Note that this is not an all-inclusive list.

LIKE ANY

This does not exist in Snowflake. Change the query comparison as below:

TERADATA:
SELECT *
FROM PRODUCT
WHERE DESC LIKE ANY ('%yog%', '%milk%')

SNOWFLAKE:
SELECT *
FROM PRODUCT
WHERE (DESC LIKE '%yog%'
   OR DESC LIKE '%milk%')

QUALIFY

This is the equivalent of a HAVING clause for a window function; in ANSI SQL it
requires an inner query to perform the windowing, with an outer query to
perform the filter:

TERADATA:
SELECT c1
FROM t1
WHERE c1 = 'A'
QUALIFY
ROW_NUMBER() OVER
(PARTITION BY c1 ORDER BY c1) = 1

SNOWFLAKE:
SELECT * FROM
(SELECT c1,
 ROW_NUMBER() OVER (PARTITION
 BY c1 ORDER BY c1) rn
 FROM t1
 WHERE c1 = 'A'
) WHERE rn = 1

