
Great Data.
By Design.

Great data isn't an accident.
It happens by design.
Ensuring that you have the clean, safe,
connected data you need to power confident
decisions and effective business processes
isn't an easy task.
You have to work at it.

The challenge is that the
market trends are working
against you as data
professionals.

Your jobs are getting harder.

Market Trend

#1
More Data. In More Places.
Moving Faster Than Ever
Before.

The volume, velocity, and variety of data are
increasing at an unprecedented pace. The
amount of data generated in the world
today is doubling every two years.
It's the new Moore's law.

2009: 0.8 zettabytes
2020: 35.2 zettabytes
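
As a rough sanity check, the two zettabyte figures can be compared against the "doubling every two years" claim. The sketch below assumes steady exponential growth between 2009 and 2020, which is a simplification, and the function name is only illustrative.

```python
import math

# Figures quoted in the slide: worldwide data volume in zettabytes.
ZB_2009 = 0.8
ZB_2020 = 35.2
YEARS = 2020 - 2009

def implied_doubling_time(start, end, years):
    """Years needed for the volume to double, assuming steady exponential growth."""
    doublings = math.log2(end / start)  # how many times the volume doubled
    return years / doublings

doubling_years = implied_doubling_time(ZB_2009, ZB_2020, YEARS)
print(f"Growth factor: {ZB_2020 / ZB_2009:.0f}x over {YEARS} years")
print(f"Implied doubling time: {doubling_years:.2f} years")
```

Under that assumption the implied doubling time comes out at roughly two years, which is consistent with the claim.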

Technologies like Hadoop allow
us to affordably store vast
amounts of data.
The power of mainframe
computing now fits in the palm
of our hands.
And, to top it off, we are attaching
RFID devices and sensors to
everything.

Take jet airplanes, for example.
A jet aircraft engine has up to 3,000 sensors
on it, and they are constantly throwing off
data. The amount of data that comes off an
engine during flight ranges from 0.5 TB to 4 TB.

And we are only just beginning.

The volume, variety,
and velocity of data
will only continue to
increase.

Market Trend

#2
Data is Everywhere
and Its Quality is
Questionable

Data is scattered everywhere:

Databases
Mobile Devices
Obscure Legacy Systems
Flat Files
Sensors
Mainframes
Social Media
CRM Applications
Message Queues
Hadoop Clusters
Cloud
ERP Applications
Unstructured Docs

It's in all the old places, and all the new
ones.
Both on-premises and in the cloud.

It used to be that data
integration projects were
limited or put at risk by the
cost and performance of CPU,
memory, network, or disk.
Today, that's no longer the
case.
Now we're limited by our ability
to deal with data that is
fragmented and of poor or
questionable quality.

To realize the full value
of their data,
organizations need to be
able to integrate it
across the entire
enterprise.
And data quality needs to be built
into the process. Manufacturing went
through a transition in the 1980s, when
the quality steps for building products
were baked into the manufacturing
process. The same needs to be done
with data.


Market Trend

#3
The Business Wants Self-Service

Over the last five years, business users have
become more technically savvy. Easy-to-use
technology now plays a large role in their personal
lives, helping them do things faster, easier, and
better. It has empowered them. And they expect
the same experience at work.

The Empowered Consumer:
Social Networking
Apps
Mobility
Search

The business
doesn't want to
wait for IT to
deliver great data.
They want to do it
on their own.

There are (some pretty cool) self-service
tools that allow them to visualize their
data.

The trouble is, they only work for a single data set at a time.
When the business needs data that crosses business

Or worse yet, they come back to IT
because they have done all they can with
their self-service tools and then realize that
the data they are using is mission-critical
and requires mission-critical processes
that they can't run on their laptop.

Self-service can only take
the business so far.

A new way of
thinking is needed.

A lot of companies believe that the way to
achieve competitive advantage is to focus
on their core business processes.

"If we're the best at what we do, we
can beat the competition."

And they aren't entirely wrong.

They believe that by investing in
applications to support those core business
processes they can use the new
efficiencies, or the improved service that
comes from those efficiencies, for
competitive advantage.

"We need an application that will
automate and improve our core
processes, so we can beat the
competition."

And they aren't entirely wrong.

The trouble is: people still think about their
business application as a single, monolithic
thing.

Business-Critical Application

That's where they're wrong.

The reality is that these processes and the
core applications supporting them aren't a
single monolithic thing. Any business
process today is highly distributed across
multiple systems ...

Business-Critical Applications

... and the number of systems and data
points to which data must flow in or out is
only increasing.


It is generally true that innovation exists at
the edges of boundaries, or the
intersection of different disciplines.

Innovation happens here

As more data gets created across more
systems, the ability to integrate and
intersect data across those boundaries
becomes a critical success factor for the
next generation of innovation.

Who are our most profitable customers?
How do I accelerate my supply chain?
Can we create new information-based services to offer our customers?
Can we drive efficiencies in our procurement processes?
How can I improve collaboration between suppliers and contractors?
Do we have all the data we need to support our compliance constraints?

But integrating data is
harder than most people
think.

Take the jet aircraft, for example.
While the engines may be the same
from plane to plane, the data
coming off them, via their 3,000
sensors, is not controlled by the
engine manufacturer. It's controlled
by the airlines.
And each airline stores those same
3,000 attributes in its own format.

Which means that when the data
for the same kind of engine is sent
back to the manufacturer for
analysis, they first have to
normalize it. What would seem like
an easy exercise, analyzing data
from the same kind of engine, is
much harder than it looks.

The additional challenge is that
the legacy data never dies and
has to be pulled in as well.
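
To make the normalization step concrete, here is a minimal sketch of mapping two airlines' records onto one common schema. Everything in it is hypothetical: the airline names, field names, and units are invented for illustration and do not come from any real airline or engine manufacturer.

```python
# Hypothetical per-airline mappings:
# source field -> (canonical field name, unit-conversion function).
AIRLINE_MAPPINGS = {
    "airline_a": {
        "egt_c": ("exhaust_gas_temp_c", lambda v: v),    # already Celsius
        "n1_pct": ("fan_speed_pct", lambda v: v),        # already percent
    },
    "airline_b": {
        "ExhaustTempF": ("exhaust_gas_temp_c", lambda f: (f - 32) * 5 / 9),
        "FanSpeed": ("fan_speed_pct", lambda frac: frac * 100),  # fraction -> percent
    },
}

def normalize(airline, record):
    """Return (normalized record, list of source fields with no known mapping)."""
    mapping = AIRLINE_MAPPINGS[airline]
    normalized, unmapped = {}, []
    for field, value in record.items():
        if field not in mapping:
            unmapped.append(field)  # surface gaps instead of silently dropping them
            continue
        canonical_name, convert = mapping[field]
        normalized[canonical_name] = convert(value)
    return normalized, unmapped

reading_a, _ = normalize("airline_a", {"egt_c": 100.0, "n1_pct": 50.0})
reading_b, extra_b = normalize(
    "airline_b", {"ExhaustTempF": 212.0, "FanSpeed": 0.5, "Notes": "legacy"}
)
print(reading_a == reading_b)  # both readings now share one schema
print(extra_b)                 # fields that still need a mapping decision
```

Even in this toy version, most of the effort sits in the mapping table rather than the code, and a real engine would need a mapping for each of its 3,000 attributes per airline.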

Every data project is like this. It is
always harder than anyone thinks,
and the number of moving parts
is only increasing.

To overcome this
challenge, you have to
design great data into
your business processes.

Just like you invest in people, process, and
technology for your core business
processes, you have to invest in people,
process, and technology to integrate the
distributed data that supports those
processes.

That is because business agility now depends on
data integration agility. And data integration
agility depends on getting everyone involved
and ensuring that the business and IT have the
right tools to enable collaboration. In fact, we've
seen that in companies where the business and IT
collaborate, data integration (DI) projects are executed 5x faster
than in companies where they don't.

Considerations
for Designing
Great Data


#1

Connect to All Your Data
RDBMS, Flat Files, XML, Hadoop, NoSQL, Social
Media, Mainframe, Machine Data, and More

Data integration enables you to combine
data from many different and rich
sources to produce new business
information you couldn't get from a
single source. Make sure your data
integration tools are able to connect to
any data source (both current and
legacy), including RDBMS, NoSQL,
mainframe, text, applications, and so on,
and not just the data sources you
consume today. It's this universal set of
connections that makes it possible to
bring all that data together.

#2

Support the Right Format and Latency
Batch, Real-Time, Near Real-Time. Structured,
Unstructured, Semi-Structured.

In the same way data integration draws
data from many different sources, it also
must be able to consume various and
multiple data types, including structured,
semi-structured, and unstructured data
sources in batch and real-time modes.
You need a tool that is flexible enough to
work with any type of data you
encounter.

#3

Understand Data Structure and Content
Include Data Profiling in Your Methodology

With so many different sources of data
involved, you need to have a means to
make sure that your data is what you
expect. It's important that your tools
allow a level of data profiling so that you
can verify the data going into and out of
your system, and ensure that you'll end
up with the desired results.
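
As an illustration of what a basic level of data profiling can look like, the sketch below summarizes each column of a small record set: how many values are missing, how many distinct values appear, and which value types occur. The function name, the sample rows, and the choice of checks are assumptions made for this example, not features of any particular tool.

```python
def profile(rows):
    """Summarize each column of a list of dict records."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing values.
        present = [v for v in values if v is not None and v != ""]
        report[col] = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "types": sorted({type(v).__name__ for v in present}),
        }
    return report

# Hypothetical sample: a duplicate id, a missing value, and a mixed-type column.
sample = [
    {"id": 1, "temp": 98.5},
    {"id": 2, "temp": None},
    {"id": 2, "temp": "n/a"},
]
for column, stats in profile(sample).items():
    print(column, stats)
```

Running the same profile on data going into and coming out of an integration job gives a quick way to confirm that the counts and types match expectations.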

#4

Enable Effective Business and IT Collaboration
Be Agile and Lean

You can't afford to create and execute
projects using traditional, isolated
development methods anymore. Your
data integration tools need to support
lean and agile integration processes that
enable business and IT collaboration so
that development happens quickly and
interactively.

#5

Support Business Growth and Expansion
Be Able to Scale Up and Scale Down

Companies grow, and so do the sizes of
their projects. You don't want to be
locked into tools that are only appropriate
for today's projects. Rather, you want
tools that have the ability to scale, grow,
and move projects from small,
innovative departmental exercises to
large, mission-critical enterprise
environments, or vice versa.

Learn how you can build
great data, by design.