PREPARING YOUR DATA FILES
If your source database does not allow you to export data files in smaller
chunks, you can use a third-party utility to split large CSV files.
Linux or macOS
The split utility enables you to split a CSV file into multiple smaller files.
Syntax:
split [-a suffix_length] [-b byte_count[k|m]] [-l line_count] [-p pattern] [file [name]]
Example:
Windows
Windows does not include a native file split utility; however, Windows supports
many third-party tools and scripts that can split large data files.
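One platform-independent option is a short script. The following is a minimal Python sketch, not a Snowflake-provided tool: the function name `split_csv`, its parameters, and the output file naming are all hypothetical. It splits by record count, similar in spirit to `split -l`, but reads through the `csv` module so that a quoted field containing line breaks is not cut in half, and it assumes UTF-8 input.

```python
import csv
from pathlib import Path


def split_csv(source, out_dir, rows_per_file, has_header=True):
    """Split one CSV file into numbered parts of at most rows_per_file records."""
    source, out_dir = Path(source), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with source.open(newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        # Remember the header row so it can be repeated in every part.
        header = next(reader, None) if has_header else None
        out, writer, count = None, None, 0
        for row in reader:
            if out is None or count >= rows_per_file:
                if out is not None:
                    out.close()
                # Hypothetical naming scheme: <stem>_0000.csv, <stem>_0001.csv, ...
                path = out_dir / f"{source.stem}_{len(parts):04d}.csv"
                out = path.open("w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                if header is not None:
                    writer.writerow(header)
                parts.append(path)
                count = 0
            writer.writerow(row)
            count += 1
        if out is not None:
            out.close()
    return parts
```

A call such as `split_csv("big.csv", "parts", 100000)` would write the parts into a `parts` directory and return their paths. Because rows are re-serialized by `csv.writer`, quoting in the output may differ cosmetically from the input.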
In general, JSON and Avro data sets are a simple concatenation of multiple
documents. The JSON or Avro output from some software is composed of a
single huge array containing multiple records. There is no need to separate
the documents with line breaks or commas, though both are supported.
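To make the difference between the two layouts concrete, the sketch below rewrites a single JSON array as a plain concatenation of documents, one per line. It is an illustration only: `array_to_documents` is a hypothetical helper, it covers JSON (Avro is a binary format and is out of scope here), and it reads the whole array into memory, which may not be practical for very large files.

```python
import json


def array_to_documents(array_text):
    """Turn the text of one JSON array into newline-separated JSON documents."""
    records = json.loads(array_text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # One compact document per line; the line breaks are optional separators.
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records) + "\n"
```

For example, `array_to_documents('[{"id":1},{"id":2}]')` yields the two documents `{"id":1}` and `{"id":2}` on separate lines.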
For the most efficient and cost-effective load experience with Snowpipe, we
recommend following the file sizing recommendations in File Sizing Best
Practices and Limitations (in this topic). If it takes longer than one minute to
accumulate MBs of data in your source application, consider creating a new
(potentially smaller) data file once per minute. This approach typically leads to
a good balance between cost (i.e. resources spent on Snowpipe queue
management and the actual load) and performance (i.e. load latency).
Creating smaller data files and staging them in cloud storage more often than
once per minute has the following disadvantages:
Various tools can aggregate and batch data files. One convenient option is
Amazon Kinesis Firehose. Firehose allows defining both the desired file size,
called the buffer size, and the wait interval after which a new file is sent (to
cloud storage in this case), called the buffer interval. For more information,
see the Kinesis Firehose documentation.
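The interplay of a size threshold and a wait interval can be sketched in a few lines. This is not the Firehose API: the class `FileBatcher` and its parameter names are hypothetical, and the sketch assumes a batch is released as soon as either threshold is reached, whichever comes first. The clock is injectable so the behavior can be checked without waiting.

```python
import time


class FileBatcher:
    """Accumulate records and release a batch on size or age, whichever is first."""

    def __init__(self, buffer_size_bytes, buffer_interval_seconds, clock=time.monotonic):
        self.buffer_size_bytes = buffer_size_bytes
        self.buffer_interval_seconds = buffer_interval_seconds
        self._clock = clock
        self._chunks = []
        self._size = 0
        self._started = None  # time the first record of the current batch arrived

    def add(self, record: bytes):
        """Buffer one record; return a finished batch (bytes) or None."""
        if self._started is None:
            self._started = self._clock()
        self._chunks.append(record)
        self._size += len(record)
        return self.flush_if_due()

    def flush_if_due(self):
        """Return the buffered batch if a threshold has been reached, else None."""
        if not self._chunks:
            return None
        too_big = self._size >= self.buffer_size_bytes
        too_old = self._clock() - self._started >= self.buffer_interval_seconds
        if too_big or too_old:
            batch = b"".join(self._chunks)
            self._chunks, self._size, self._started = [], 0, None
            return batch
        return None
```

A caller would write each returned batch to cloud storage as one file and call `flush_if_due()` periodically so that a slow trickle of records is still staged once the interval elapses. With an interval of 60 seconds this matches the once-per-minute guidance above.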
{"foo":1}
{"foo":"1"}
Format                  Description
HH12                    Two digits for hour (01 through 12); am/pm allowed.
AM, PM                  Ante meridiem (am) / post meridiem (pm); for use with HH12.
TZH:TZM, TZHTZM, TZH    Time zone hour and minute, offset from UTC. Can be prefixed by +/- for sign.
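As a rough analogue outside Snowflake, a value that uses a 12-hour clock, an AM/PM marker, and a signed UTC offset can be parsed in Python as sketched below. The sample value is hypothetical, and the `strptime` directives (`%I`, `%p`, `%z`) are Python's, not Snowflake format elements; they only play similar roles to HH12, AM/PM, and TZH:TZM. The sketch assumes Python 3.7 or later (for the colon in the offset) and an English or C locale for the AM/PM marker.

```python
from datetime import datetime, timedelta

# Hypothetical sample value: 12-hour clock, AM/PM marker, UTC offset as +HH:MM.
value = "2019-03-04 02:15:30 PM +05:30"

# %I = hour on a 12-hour clock, %p = AM/PM marker, %z = signed UTC offset.
parsed = datetime.strptime(value, "%Y-%m-%d %I:%M:%S %p %z")

# The 12-hour value plus the PM marker resolves to hour 14 on a 24-hour clock.
hour_24 = parsed.hour
offset = parsed.utcoffset()
```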
Oracle only. The Oracle DATE data type can contain date or timestamp information. If your Oracle database includes DATE columns that also store time-related information, map these columns to a TIMESTAMP data type in Snowflake rather than DATE.
Note
Snowflake checks temporal data values at load time. Invalid date, time, and
timestamp values (e.g., 0000-00-00) produce an error.
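Because such values fail at load time, it can help to screen date columns before staging the files. The following is a minimal sketch under stated assumptions: `invalid_dates` is a hypothetical helper, it only handles dates written as year-month-day, and Python's own validity rules (for example, a minimum year of 1) are not guaranteed to match Snowflake's exactly.

```python
from datetime import datetime


def invalid_dates(values):
    """Return (index, value) pairs for strings that fail to parse as %Y-%m-%d."""
    bad = []
    for i, v in enumerate(values):
        try:
            datetime.strptime(v, "%Y-%m-%d")
        except ValueError:
            # Covers placeholder values such as 0000-00-00 and impossible
            # calendar dates such as February 30.
            bad.append((i, v))
    return bad
```

Running it over a column extracted from a data file gives the positions of the values to correct or replace with NULL before loading.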