© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
What is Big Data?
Generate: Individual AWS customers generating over PB/day
Collect & Store: Store exabytes of data in S3
Analyze
Analyze: Highly Constrained
[Chart: Generated Data vs. Available for Analysis; x-axis: Year, 1990 to 2020]
Sources:
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
The tyranny of “OR”
Anything you can dream up and code
Optimized for data warehousing
But I don’t want to choose.
S3
SQL
High concurrency: Multiple clusters access same data
No ETL: Query data in-place using open file formats
Full Amazon Redshift SQL support
Life of a query
1. Query
SELECT COUNT(*)
FROM S3.EXT_TABLE
GROUP BY…
[Diagram: JDBC/ODBC client connection to Amazon Redshift; labels 1, 2, 3, 4, ... N]
2. Query is optimized and compiled at the leader node. Determine what gets run locally and what goes to Amazon Redshift Spectrum.
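The routing decision in step 2 can be pictured with a toy sketch. This is not Redshift's planner, only a minimal illustration under stated assumptions: scans are tagged by schema name, any schema listed in the hypothetical `EXTERNAL_SCHEMAS` set is treated as S3-backed and sent to Spectrum, and everything else runs locally. The names `TableScan`, `split_scans`, and the `public.customers` table are invented for the example; only `S3.EXT_TABLE` comes from the slide.

```python
from dataclasses import dataclass

# Assumption: schemas in this set refer to external (S3-backed) tables.
EXTERNAL_SCHEMAS = {"s3"}


@dataclass(frozen=True)
class TableScan:
    """One table reference in a query (hypothetical, for illustration)."""
    schema: str
    table: str


def split_scans(scans, external_schemas=EXTERNAL_SCHEMAS):
    """Partition scans into (local, spectrum) lists by schema name."""
    local, spectrum = [], []
    for scan in scans:
        # Schema names are compared case-insensitively in this sketch.
        if scan.schema.lower() in external_schemas:
            spectrum.append(scan)
        else:
            local.append(scan)
    return local, spectrum


if __name__ == "__main__":
    scans = [
        TableScan("S3", "EXT_TABLE"),       # from the slide's query
        TableScan("public", "customers"),   # hypothetical local table
    ]
    local, spectrum = split_scans(scans)
    print("local:", [f"{s.schema}.{s.table}" for s in local])
    print("spectrum:", [f"{s.schema}.{s.table}" for s in spectrum])
```

A real planner would decide per operation (scans, filters, aggregations) rather than per table; the sketch only shows the split between local and external work.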