DATA WAREHOUSE

TRENDS REPORT
2017

Data Warehousing
Preferences and Challenges
Panoply Annual Survey
Table of Contents
Executive Summary
The Data Warehouse
Key Findings and Conclusions
The Survey Results
About the Survey Sample
Final Notes
About Panoply

Copyright Notice. Any Panoply information that is to be used in advertising, press releases, or promotional
materials requires prior written approval from the Panoply CEO. A draft of the proposed document should
accompany any such request. Panoply reserves the right to deny approval of external usage for any reason.
Copyright 2017 Panoply Ltd. Reproduction without written permission is completely forbidden.
2017 PANOPLY DATA WAREHOUSE TRENDS REPORT

DATA WAREHOUSING PREFERENCES AND CHALLENGES

Executive Summary
Amazon re:Invent is always a rewarding experience. It gives us the opportunity not
only to demonstrate Panoply's automated data warehouse solutions to thousands of IT
professionals, but also to gather their feedback as a means to gauge cloud-industry
trends and understand the market's most pressing needs. In that spirit, we conducted
a survey during our time at re:Invent 2016 that offers insight into today's use of
data warehousing systems, Redshift in particular, while also exposing users' biggest
challenges and advantages.

Our survey includes input from more than 800 re:Invent attendees, who answered ten
questions about their level of satisfaction with their current data warehousing
solution and the reasons behind their responses. The respondents come from diverse
sectors and hold a wide variety of positions and specialties within their
organizations.

Findings from the survey support what industry experts such as Gartner have been
saying:

Cloud-based data warehousing solutions, such as Amazon Redshift, are
transforming the market, leading to a noticeable shift in industry
leadership and the way vendors will have to approach clear needs and
challenges that still exist in data warehousing.

The Data Warehouse


The need to organize and process data was already recognized in the 1970s, when
(according to Wikipedia) the ETL process began to be more widely accepted.
However, in recent years, the approach to data storage, organization, and analysis has
been undergoing a radical change, driven by executives in various industries who
envision their organizations as data-driven businesses and therefore seek solutions
from the industry.

A data warehouse is a collection of data in which multiple disparate data sources can
be loaded and integrated into the same repository. The system's logical design
facilitates the integration of data sources and allows the generation of new,
additional valuable data sources without significant structural adjustment.
Ultimately, a data warehouse should be larger than the sum of its data, and serve as
an ongoing intelligent resource for use by multiple members of an organization,
large or small.
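
The idea of loading disparate sources into one repository and deriving new data from
them can be illustrated with a minimal sketch. It assumes an in-memory SQLite database
standing in for the warehouse, and the two sources, the table names, and the function
names are all hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical source 1: rows exported from a CRM system.
crm_rows = [
    (1, "Acme Corp", "Finance"),
    (2, "Globex", "Retail"),
]

# Hypothetical source 2: order events from a web application.
order_events = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 2, "amount": 45.5},
]


def build_warehouse(crm, orders):
    """Load both sources into one in-memory SQLite repository."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, industry TEXT)"
    )
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", crm)
    conn.executemany("INSERT INTO orders VALUES (:customer_id, :amount)", orders)
    conn.commit()
    return conn


def revenue_by_industry(conn):
    """Derive a new data set by joining the two integrated sources."""
    query = """
        SELECT c.industry, SUM(o.amount) AS revenue
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.industry
        ORDER BY c.industry
    """
    return conn.execute(query).fetchall()


warehouse = build_warehouse(crm_rows, order_events)
result = revenue_by_industry(warehouse)
print(result)
```

Neither source contains revenue per industry on its own; that figure only exists once
both are integrated in the same repository, which is the sense in which a warehouse can
be larger than the sum of its data.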

For that to happen, data warehouse technologies require data virtualization,
processing, and transformation methods. There are several delivery models, including
physical appliances, such as dedicated traditional storage subsystems built to
support analytics and BI performance. With the addition and ongoing evolution of
the cloud, you will find today cloud-based solutions such as Amazon Redshift¹ and
Google BigQuery² that aim to simplify both the hosting and the analysis of data in
an increasingly complicated environment.

In addition to the explosive growth in the amount of data and data sources we've
seen in recent years, another motivation for creating even more sophisticated data
warehousing systems is the ever-increasing need for customizable business
intelligence and analytics.

In this fast-paced, innovative environment, however, challenges remain. This,
perhaps, was one of the most significant findings of our recent survey. As a result,
data repositories in organizations remain isolated, and every day companies lose
advanced data integration opportunities, such as the insights that can be derived
from integrating their own data with publicly available data sources.

In this paper, we will examine what these results imply about the current state of
affairs in the data warehouse community, as well as what industry leaders must
address in order to adapt to and keep up with their data demands. We will see how
Amazon Redshift continues to satisfy its core base of customers, and how the results
bear out what experts predict: that Amazon Redshift is gaining traction against its
strongest competitors.

1 https://aws.amazon.com/redshift/
2 https://cloud.google.com/bigquery/

Key Findings and Conclusions


The following are the major takeaways from the Amazon re:Invent 2016 survey we
conducted:

Redshift is a leading data warehouse solution

As the survey was conducted at an Amazon event, it wasn't a revelation to learn
that the majority of the respondents, 60%, use Amazon Redshift. Interestingly,
however, the next largest group, at 21% of all respondents, runs an on-premises
solution, as opposed to a cloud-based one, while only 3% reported being users of
BigQuery. This was somewhat of a surprise considering the recent media buzz around
the solution, though not wholly so, considering that Gartner does not recognize
Google as a major market player.

Management complexity is the main challenge

We found that most respondents, almost 60%, find their data warehouse difficult to
manage. On the bright side, however, 51% of them were satisfied with their
solution's performance. In addition, 60% of the respondents indicated that their
major complaint was complexity. Of the respondents challenged by complexity, 20%
work in enterprises using a cloud-based solution, and 41% of all Redshift users say
it is complex. That most respondents were technical professionals, such as IT
specialists and developers, means that ease of use is still a weak spot for data
warehouse providers, and streamlining operations is still a challenge they need to
meet.

The majority are not using ETL tools

When asked which ETL tools they use, respondents named three ETL leaders: Segment,
Stitch, and Talend. However, 61% of respondents are still not using any ETL tool.
Given that most of them are likely Amazon Redshift users, and assuming a correlation
between these two findings, this result, taken together with the complexity
challenge, is not so surprising. Compared with traditional DWH solutions, Amazon
Redshift is still relatively new, and though we believe it will take the lead in
the years to come, there are still management gaps, such as multi-tenancy and
simple, streamlined loading of data to the cloud.

Adoption of BI-based cloud solutions

Finally, in seeking to learn which BI tools users preferred, we found that 35% use
Tableau (with 67% of those using Redshift). The next major group, 25% of all
respondents, reports using no BI tool at all. Given these results and the fact that
the majority are not using any ETL tool, we can confidently conclude that AWS
Redshift has opened up the data warehouse opportunity, but organizations are still
at the beginning of building an end-to-end BI solution.

All in all, organizations are becoming increasingly aware of the value of data
warehousing beyond simple storage; they're calling for better ways to extract
information from their data and analyze it. We believe that Amazon Redshift holds
the key to simplifying that task, making the data warehouse accessible and effective
not only for large and well-funded enterprises, but also for medium and, perhaps,
even small ones.

The fact is that companies are employing a data warehouse solution but still suffer
as a result of its complexity, and most do not use ETL tools. And while the survey
results indicate that Amazon Redshift is gaining traction, we also conclude that the
use of cloud-based data warehousing, complemented by a rich BI tool such as
Tableau, is still in its early stages.

The Survey Results


THE CHOICE OF A DATA WAREHOUSE SOLUTION

WHAT DWH SOLUTION DO YOU USE?

60% Redshift
21% On premise
9% Azure
7% Other
3% BigQuery

HOW CHALLENGING IS YOUR DWH MANAGEMENT?

59% Difficult
27% Easy
10% Very Difficult
4% Very Easy

OVERVIEW OF CUSTOMER LEVEL OF SATISFACTION WITH CURRENT DATA WAREHOUSING SOLUTION

HOW SATISFIED ARE YOU WITH YOUR DWH?

64% Satisfied
25% Unsatisfied
8% Very Satisfied
3% Very Unsatisfied

INVESTIGATION INTO REASONS FOR SATISFACTION OR DISSATISFACTION

WHY ARE YOU SATISFIED?

Reasons: Performance, Other, Simplicity, Cost
Values shown: 51%, 26%, 19%, 4%

WHY ARE YOU UNSATISFIED?

Reasons: Complexity, Cost, Performance, Other
Values shown: 47%, 25%, 19%, 9%

REDSHIFT USERS: REASONS FOR BEING UNSATISFIED

41% Due to Complexity
37% Other
22% Performance

USE OF ETL TOOLS

WHAT ETL TOOL DO YOU USE

61% NONE

10% TALEND

9% SEGMENT

9% OTHER

6% STITCH

3% FIVETRAN

2% ATOM

USE OF BI TOOLS

WHAT BI TOOL DO YOU USE?

35% Tableau
25% None
8% PowerBI
7% DataBricks
6% Other
5% QlikView
5% Domo
4% Looker
3% Chartio
2% Sisense

DWH MOST USED BY TABLEAU USERS

67% Redshift
17% On Premise
9% Other
7% Azure

About the Survey Sample


Our research includes the responses of 833 visitors to our booth at re:Invent 2016
who agreed to participate in this survey.
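
To give a sense of scale, the rounded percentages reported in the survey results can
be turned into approximate head counts. The sketch below is only an illustration: it
assumes that all 833 respondents answered the question on DWH solutions, which the
report does not state, and the function name is hypothetical.

```python
TOTAL_RESPONDENTS = 833  # sample size stated in this report

# Rounded shares from the "What DWH solution do you use?" chart, in percent.
dwh_share = {
    "Redshift": 60,
    "On premise": 21,
    "Azure": 9,
    "Other": 7,
    "BigQuery": 3,
}


def approximate_counts(shares, total):
    """Convert rounded percentage shares into approximate respondent counts."""
    return {name: round(total * pct / 100) for name, pct in shares.items()}


counts = approximate_counts(dwh_share, TOTAL_RESPONDENTS)
for name, count in counts.items():
    print(f"{name}: about {count} respondents")
```

Because the published percentages are already rounded, the resulting counts are
estimates rather than exact tallies.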

Because re:Invent participants in general, and visitors to our booth in particular,
were most likely cloud users, we assume that the population from which we gathered
information skews toward cloud-based data warehouse users, and possibly even toward
Amazon Redshift users.

Despite the possible biases of our respondents in terms of their preference toward
Amazon services, the individuals who participated in the survey do span various
industries from software to finance, apparel to healthcare. Their roles within the
companies range from IT to management, data engineering to sales and marketing.
Therefore, we believe the results to be comprehensive and reflective of a wide
spectrum of users, influencers, and decision-makers within data organizations.

COMPANY SIZE

[Bar chart: number of respondents (axis 0 to 300) by company size in employees:
1-50, 51-100, 101-500, 501-1000, 1001-5000, 5000+]

COMPANIES USING A CLOUD DWH BY NUMBER OF EMPLOYEES

30% 5001+
19% 1001-5000
11% 501-1000
10% 201-500
16% 51-200
14% 1-50

RESPONDENTS BY INDUSTRY

3% Education
13% Finance & Insurance
8% Government & NGOs
7% Health & Pharma

22% Internet
3% Manufacturing
7% Retail & Travel
30% Technology

2% Construction & Utilities
2% Consulting & Services
2% Media & Entertainment
1% Other

Final Notes
Survey Results Align with Gartner's Magic Quadrant for Data Warehouse and Data
Management Solutions for Analytics

In March, in conjunction with the release of its 2016 Magic Quadrant for Data
Warehouse and Data Management Solutions for Analytics, leading industry analyst
Gartner cautioned all market leaders, including IBM, Microsoft, Oracle, SAP, and
Teradata, to recognize the competition facing them as data warehousing moves into
the cloud, in particular from Amazon Redshift, which, as we have seen, has gained
great traction. Our survey supports Gartner's conclusions and shows that, given a
growing desire for better use of data and the system management challenges data
professionals still face, Amazon Redshift continues to provide users with a more
satisfying data warehouse experience, especially when complemented with products
like those offered by Panoply, which fill the holes in AWS Redshift's service.

Gartner notes the continued impact the public cloud is having on the way IT
professionals approach their organizational responsibilities, as well as on users'
expectations for logical data warehouses. It predicts a data warehouse
transformation as a result, which is another reason Amazon Web Services is getting
closer and closer to becoming a real contender for market leadership, especially
when supported by end-to-end data management platforms such as Panoply.

About Panoply
Our story begins with an idea: In the Big Data era, free up your data engineers and scientists, and you create
value for your customers and your business. It's simple, right? We believe in taking the load off the IT and data
engineers who have long been mired in time-intensive tasks like schema building, data mining, complex
modelling, performance tuning... Our easy-to-use platform gives small and medium businesses the tools to
harness Big Data and get analytics quickly, so they can make faster and better business decisions.

Panoply.io 1161 Mission St. San Francisco, CA 94103 hello@panoply.io