
Future-Proof Your Data Lake Environment

Maximize the Value of Big Data to Your Organization

Sponsored by Arcadia Data


Data Lakes Are Valuable
Data lakes add value. Their use improves organizations’ operations and
business performance. Our research underscores this; four out of five
participating organizations consider the analytics enabled by data lakes
to be important. Data lakes deliver benefits in the form of improved
competitive advantage, better responsiveness and faster time to market
as well as better communication and alignment in internal processes.

But these benefits aren’t automatic. To realize them, an organization
should choose technologies and institute processes that support the
lines of business in their efforts to access and take full advantage
of data lakes.

Takeaway: Data lakes provide significant business benefits.

Tool Choices Are Changing
To realize value, users need to be able to access and analyze the information available in the data lake
easily and efficiently. Our research shows that they prefer to use business intelligence tools for analysis.
However, many of these tools buckle under the volumes of data managed as part of a data lake. For
this reason, users need new approaches to effectively utilize data lakes.

The tools to extract information from a data lake are changing and so are the
underlying data-management technologies. While Hadoop deployed on-premises
was the primary platform for many of the first data lake deployments, NoSQL
technologies and cloud-based deployments are also becoming common. Cloud
object stores such as Amazon S3 coupled with Spark processing are also
increasingly considered viable alternatives.

Takeaway: Data-lake technologies are changing, requiring
new approaches to business intelligence.

Go Beyond Transaction Data
The size-constrained data warehouse, the technology
predecessor of the data lake, typically contained transactional
finance, sales and operational data. Data lakes can handle
not only a greater volume but also a greater variety of data.

Given that, an organization can maximize the value of its data
by including for analysis not only transaction data but also,
for example, interaction data from customer service centers
to get a more complete picture of the products and services
offered as well as customers and prospects. Event data and
machine data can also provide detailed information about
an organization’s operations and its systems. The information
can be used to improve logistics, supply chain activities,
websites and other aspects of the organization.

Takeaway: Consider a broader variety of data sources for analysis.

Consider External Data Sources
Our research shows that external data sources are commonly used as
input to data lakes, second only to transaction data. For example,
nearly half (46%) of organizations are incorporating social media data
into their data lakes to provide a more informed view of customers and
prospects. Analysis of social media also can help an organization
improve the performance of products and services.

There’s an array of external data services available including
economic, demographic, market, government and weather data. Combining
these data sources with internal data for analysis provides a more
complete picture of customer and market dynamics, enabling
better-informed business decisions.

[Figure: Important External Data Sources for Big Data Analytics.
Sources shown: cloud computing business applications, social media,
economic data sources, Internet information sources, consumer
demographic sources.]

Takeaway: Including external data sources in data lakes yields more
useful analyses.

Data Streams In Constantly
Virtually every aspect of an organization involves business processes that generate data continuously.
Today these include IoT devices, production lines, logistics activities and websites as well as IT systems.
Financial markets generate trade data continuously. Social media streams continuously.

Some organizations are turning this information availability into an opportunity – nearly one in five
(19%) are processing data in real time.

As it plans for the future, the organization must realize that the batch processes it has used to feed
static data warehouses are no longer sufficient. Nor are analytical tools that assume the underlying data
is static. It must ensure that its data lake architecture can accommodate processing and analyzing the
growing and accelerating streams of dynamic data.

Takeaway: Anticipate more streaming data sources and analyses.

Enable New Types of Analytics

Data lake investments should enable analytics that extend beyond those
currently being used. With all the detailed data stored in a data lake,
organizations can perform analyses at a much finer level of detail
than was previously possible. Such detailed access allows
an employee to follow a train of thought – from high-level
trends to specific subsets of data in that trend – within
a single environment.

This detailed data availability also enables forward-looking
predictive analytics using machine-learning and
artificial-intelligence techniques. Many of these require
large amounts of computational power, so investments
should support emerging hardware-acceleration
techniques such as GPUs.

Takeaway: Organize investments in data lake projects to enable
new analytics.

Avoid Vendor or Platform Lock-In
With the continuing evolution of data lake technologies,
it’s important to adopt a flexible architecture that can
accommodate changes. Most data resides on-premises
today, but 44% of organizations operate in a hybrid data
environment. Organizations increasingly are using
a variety of big-data technologies, among them Hadoop,
NoSQL, cloud-based object stores, graph databases and
GPU databases.

Therefore, to protect investments in the data lake, tools
should be able to handle a variety of sources, run anywhere
rather than in just a single on-premises or cloud environment
and evolve to support new sources as they are adopted.

Takeaway: Flexibility is important in data lake investments
to deal with a frequently changing technology landscape.

Provide Self-Service

Line-of-business units need on-demand access to data to make the most
informed business decisions. Our research shows that 88% of
organizations consider it important to provide access to data without
the involvement of IT. However, fewer than half (44%) are comfortable
providing that direct access. This hesitation is most often due to
concerns about governance, risk and security.

Organizations that can overcome these issues and provide self-service
big-data analytics get better results. When business units design and
deploy big data analytics, nearly 75% are satisfied with the resulting
insights; however, only 54% are satisfied when IT designs and deploys
the analytics.

[Figure: Self-Service Improves Use of Analytics. Satisfied: access
analytics without IT, 72%; IT needed to access analytics, 54%.]

Takeaway: Design your data lake to extend self-service capabilities
and reduce IT bottlenecks.

Stay Ahead of the Game
Data lakes can provide significant value to your organization and
help it improve its business processes. They support the analyses
an organization needs to have a complete picture of its operations
and its customers, allowing it to stay ahead of its competition.

Data lake architectures and technologies continue to evolve at
a rapid pace. In order to stay ahead of these changes, we recommend
that you adopt tools and processes that are flexible and adaptable.
With this approach your organization can continue to leverage data
to improve its performance even in the face of change.

The Ventana Research benchmark research reports Big Data Analytics, Data Preparation,
and Data and Analytics in the Cloud can be found at www.ventanaresearch.com.
© Ventana Research 2018. All rights reserved.
