Executive Briefing Series:

DATA ANALYTICS

Sponsored by Teradata

Data Analysis:
Refining IT's New Oil

BY TOM TEMIN

Panel of Experts
Thomas Sasala, the chief data officer of the Department of the Army
Edward Kearns, Ph.D., the chief data officer at the National Oceanic and Atmospheric Administration
Jodi Morton, vice president for Single-Family Data Governance and Management at Freddie Mac
Michael Peckham, the director of the DATA Act Program Management Office at the Health and Human Services Department
Christina Ho, deputy assistant secretary for accounting policy and financial transparency at Treasury
David Kuenzel, the vice president and general manager of Teradata Government Systems, LLC
Alan Ford, Teradata's director of government systems pre-sales
Dr. Timothy Kropp, senior adviser to the chief data officer in the HHS Office of Inspector General
Kris Rowley, chief data officer of the General Services Administration

For federal agencies, data has emerged as a new kind of currency. It's not that data itself is new, but rather the nature and value of data has changed. Where agencies once held data only in highly structured databases tied to specific applications, they now also have data from many other sources and in a variety of forms and formats. Network logs, sensors in a variety of domains, social media, and the various financial and human resource systems within agencies are just a few of the sources generating this data.

The resulting asset has various names: data holdings, data equities, for example. However, like crude oil, data's value remains largely potential unless agencies have strategies for processing and analyzing it. SQL queries and reports remain important tools, but they don't work in mixed or unstructured data environments. For greater scientific insight, more efficient program management and the development of digital services, agencies need a 21st century data strategy.

To explore what these strategies might look like and what tools they might require, Federal News Radio convened a panel of
federal data officers joined by a pair of subject matter experts from a leading data analytics tools vendor.

From the military side, Thomas Sasala, the chief data officer of the Department of the Army, says the Army maintains four major portfolios of data, only one of which is tactical. Yet all of the data stores can, in some manner, contribute to the Army's knowledge of its own readiness and where the gaps lie. So his focus is on establishing a readiness dashboard informed by all relevant data sources.

For Edward Kearns, Ph.D., the chief data officer at the National Oceanic and Atmospheric Administration, the challenge is making some 30 petabytes of public domain data more accessible. Because of the volume of data and other staff duties, this constitutes a strain on resources.

The challenge for Jodi Morton, vice president for Single-Family Data Governance and Management at Freddie Mac, is embedding a more data-oriented culture both in the IT staff and in the lines of business owners, such that they can realize the value of their decisions based on data analytics. There is a fair amount of activity in gathering and sharing data, maintaining the corporate data warehouse, and culling unneeded assets, such as tables no one uses anymore.

In short, data really has become the crude oil of IT. Properly refined, it ensures no manager needs ever make a decision in a vacuum, said Michael Peckham, the director of the DATA Act Program Management Office at the Health and Human Services Department.

For agencies, then, the next generation of IT activity will include innovative ways of obtaining greater value from the data they have. Doing so will require a three-pronged strategy:

1. Establishing a culture of data sharing among components. "If people aren't talking, you know the systems aren't talking," Peckham said. Also required is some evangelism to help users understand the data available to them, and how they can pose new questions about their work.

2. Reorienting data stores to optimize value using data analytics. No single form of storage, whether so-called data lakes, warehouses, NoSQL databases or fully structured databases, will necessarily be the one solution needed.

3. Installing an infrastructure of tools for analyzing and visualizing data, so it ultimately becomes useful.

Data policies

Federal IT policy development over the last several years has, in part, emphasized open standards, data sharing and the use of data to improve program management and transparency.

An important example comes from the Digital Accountability and Transparency Act. It brings together many of the issues surrounding the bigger data challenge. Enacted in 2014, the DATA Act seeks to make federal spending data available and cogent by requiring agencies to report it in a common format. The law charged the Office of Management and Budget and the Treasury Department with establishing nonproprietary, searchable, and widely accepted standards for spending data.

Christina Ho, deputy assistant secretary for accounting policy and financial transparency at Treasury, leads implementation of the DATA Act's provisions on standards. Data on $4 trillion in annual spending, she said, resides in some 100 agencies, each with multiple systems. She said that while matching data formats sounds simple, the complexity builds, thanks in part to the number of elements required to make sense of federal spending.

"We were able to develop industry-based, integrated data standards that have budget, accounting, procurement, financial assistance, which includes loans, insurance, and grants," Ho said. She added that Treasury is seeding certain analytics products on a public platform, because of the complexity of the relationships among the data sets, to help potential users in agencies, Congress and industry.

It's also important to manage the sheer growth in data. David Kuenzel, the vice president and general manager of Teradata Government Systems, LLC, pointed to one example: the Centers for Medicare and Medicaid Services (CMS). About a decade ago, he said, the data in the Integrated Data Repository (IDR) at CMS amounted to about five terabytes. Today, the IDR has grown to a high-end 1-petabyte system, and is now integrated with a large Hadoop system, thereby creating more holistic analytical capabilities.

That volume also yields, with the right analysis tools applied, patterns leading to fraud discovery. The agency has recovered billions under its Fraud Prevention Initiative, Kuenzel said.

Kuenzel and Alan Ford, Teradata's director of government systems pre-sales, said federal data officials are starting to rethink how they manage data. Kuenzel said a growing number of agencies have more than a petabyte of data. "The cool and new thing out now," Kuenzel said, "[is that for] some of the data sets that aren't necessarily new, now there are technologies that can help agencies leverage them and pull them into their analytical ecosystem."

He added that agencies are still working to integrate data sources using open source tools into various settings, such as data lakes. These tools are sometimes like a "free puppy," meaning the upfront capital costs are low, but
they may come with significant integration and lifecycle costs, and many times, the data scientists required to leverage these technologies are hard to find in the marketplace. Recently, many agencies have been looking at ways to bring multiple data sets into a more unified data architecture, where these data sets can easily integrate and communicate with the more traditional data sets sitting in data warehouse technology, thereby offering up a more integrated analytical ecosystem.

It turns out, Ford said, that so-called data lakes and other large collections of data from multiple sources and in multiple formats can be difficult to query.

"The data lake can be more like a swamp. It's hard to do analytics on Hadoop," he said, referring to the open source repository for large data sets. "The biggest bang for your buck analytically remains putting your structured data into a well-modeled SQL data repository."

The challenge then becomes how to make use of unstructured and high-volume streaming data within the analytics program, alongside the well-modeled data. Now, Ford said, tools are emerging that can do this efficiently under a model he called "intelligent federation."

Data strategies and the user

If analytics is to answer users' questions, analytic products need to be user oriented, Treasury's Ho said. IT needs to understand, what are their questions? What are their pain points? "If data is the new oil, we have been spending a lot of time drilling, and now we have to start using it," she said.

No agency should assume a single analytics strategy can fit every situation, said Dr. Timothy Kropp, senior adviser to the chief data officer in the HHS Office of Inspector General. Data stewards are likely to deal with many skill levels, he said, from average program users to data scientists.

It therefore becomes important to work with users, understand what they need to know, and assemble analytics apps using carefully chosen, relevant data sets for the problem at hand. That's a very different approach from tossing everything into a warehouse or Hadoop instance and throwing tools at it.

Freddie Mac's Morton stressed the importance of communication with users. "I can buy all the tools and get all the data in a lake, and it doesn't mean anything if our senior executives don't know what action to take." She asked her data scientists to develop demonstration use cases to show the "so what" of data analytics to the business-line owners.

By developing use cases for the data agencies are to generate and report, Ho said that Treasury is trying to ensure the DATA Act becomes more than a compliance exercise. Her team is using open source tools to demonstrate what analytics can derive from the data, and how those derivations can help a business challenge.

Peckham recommended an agile-like process for getting at the core user needs for data analytics, starting with the questions for which program managers want answers. In the case of Medicare fraud, an important goal is deriving analytics outcomes that prosecutors would be able to use in court.

"Analytics, in essence, is an iterative process," he said. "It's not something that gets served up. You're always creating something, but someone says, 'Great, tweak this.'"

Kris Rowley, chief data officer of the General Services Administration, said simple questions of single transactional systems are relatively easy to execute. But, "As you escalate the level of the question and the complexity of the question, that's when you need data scientists with better and more sophisticated tools," he said.

Panelists also recommended not overlooking the importance of data presentation to users such that it's cogent and understandable. This is where visualization tools come in, and why it's important to involve users at all stages of a data analysis project.

Data analysis priorities

Just as agencies have a variety of data challenges, as noted in the beginning of this white paper, they also have a range of priorities for their data programs.

Peckham of HHS said many agencies, in response to the DATA Act, want to use data analytics to understand the return on investment of technology and other expenditures. This can lead to better acquisitions and better program outcomes. He added, for every agency facing data calls from OMB or Congress, a well-ordered data strategy, as envisioned by the DATA Act, can help manage and speed them up.

Sasala of the Army Department said his goal for data analysis is enabling improved decisions. For example, matching personnel information with logistics data can help to ensure the greatest lethality. Morton envisions lowering the costs and raising the quality of instruments used for housing finance.

Regardless of the data program goals, reaching them requires a strategy that incorporates multiple data sets in a well-architected system that allows for the intelligent federation of unstructured, but important, data sets. The strategy must also include a way to make data analytics meaningful and accessible to users.
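
For readers who want a concrete picture of what querying well-modeled structured data alongside less structured records can look like, the following is a minimal Python sketch. It is an illustration only and does not depict any panelist's system or any vendor's product. Every table name, field and sample record is hypothetical, and SQLite stands in for a data warehouse purely because it ships with Python.

```python
import json
import sqlite3

# Hypothetical well-modeled structured data: one row per payment.
PAYMENTS = [
    (1, "provider_a", 1200.0),
    (2, "provider_b", 300.0),
    (3, "provider_a", 4500.0),
]

# Hypothetical semi-structured records (for example, log lines) held
# outside the relational store. Fields vary from record to record.
LOG_LINES = [
    '{"payment_id": 1, "event": "address_change"}',
    '{"payment_id": 3, "event": "address_change", "note": "same day"}',
    '{"payment_id": 3, "event": "duplicate_claim"}',
    "not valid json",
]


def parse_events(lines):
    """Yield (payment_id, event) pairs, skipping lines that do not parse."""
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict) and "payment_id" in record and "event" in record:
            yield record["payment_id"], record["event"]


def flagged_payments(payments, log_lines):
    """Join structured payments with parsed log events in one SQL query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE payments (id INTEGER PRIMARY KEY, provider TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO payments VALUES (?, ?, ?)", payments)
        # The parsed events are exposed to SQL as a temporary table so the
        # same query language covers both kinds of data.
        conn.execute("CREATE TEMP TABLE events (payment_id INTEGER, event TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?)", parse_events(log_lines))
        rows = conn.execute(
            """
            SELECT p.id, p.provider, p.amount, COUNT(e.event) AS event_count
            FROM payments AS p
            JOIN events AS e ON e.payment_id = p.id
            GROUP BY p.id, p.provider, p.amount
            ORDER BY event_count DESC, p.id
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    for row in flagged_payments(PAYMENTS, LOG_LINES):
        print(row)
```

The design choice in this sketch is that the structured table stays in its modeled form, while the looser records are parsed only as far as the question at hand requires. Production tools would presumably push such queries to the systems where the data already lives rather than copying it into one place.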
