
Technology

Harnessing the power of external data
Few organizations take full advantage of data generated outside
their walls. A well-structured plan for using external data can provide
a competitive edge.

By Mohammed Aaser and Doug McElhaney


February 2021
Many companies have made great strides in collecting and utilizing data from their own activities. So far, though, comparatively few have realized the full potential of linking internal data with data provided by third parties, vendors, or public data sources. Overlooking such external data is a missed opportunity. Organizations that stay abreast of the expanding external-data ecosystem and successfully integrate a broad spectrum of external data into their operations can outperform other companies by unlocking improvements in growth, productivity, and risk management.

The COVID-19 crisis provides an example of just how relevant external data can be. In a few short months, consumer purchasing habits, activities, and digital behavior changed dramatically, making preexisting consumer research, forecasts, and predictive models obsolete. Moreover, as organizations scrambled to understand these changing patterns, they discovered little of use in their internal data. Meanwhile, a wealth of external data could (and still can) help organizations plan and respond at a granular level.

Although external-data sources offer immense potential, they also present several practical challenges. To start, simply gaining a basic understanding of what’s available requires considerable effort, given that the external-data environment is fragmented and expanding quickly. Thousands of data products can be obtained through a multitude of channels, including data brokers, data aggregators, and analytics platforms, and the number grows every day. Analyzing the quality and economic value of data products also can be difficult. Moreover, efficient usage and operationalization of external data may require updates to the organization’s existing data environment, including changes to systems and infrastructure. Companies also need to remain cognizant of privacy concerns and consumer scrutiny when they use some types of external data.

These challenges are considerable but surmountable. This article discusses the benefits of tapping external-data sources, illustrated through a variety of examples, and lays out best practices for getting started. These include establishing an external-data strategy team and developing relationships with data brokers and marketplace partners. Company leaders, such as the executive sponsor of a data effort and a chief data and analytics officer, and their data-focused teams should also learn how to rigorously evaluate and test external data before using and operationalizing the data at scale.

External-data success stories
Companies across industries have begun successfully using external data from a variety of sources (Exhibit 1). The investment community is a pioneer in this space. To predict outcomes and generate investment returns, analysts and data scientists in investment firms have gathered “alternative data” from a variety of licensed and public data sources, many of which draw from the “digital exhaust” of a growing number of technology companies and the public web. Investment firms have established teams that assess hundreds of these data sources and providers and then test their effectiveness in investment decisions.

A broad range of data sources are used, and these inform investment decisions in a variety of ways:

— Investors actively gather job postings, company reviews posted by employees, employee-turnover data from professional networking and career websites, and patent filings to understand company strategy and predict financial performance and organizational growth.

— Analysts use aggregated transaction data from card processors and digital-receipt data to understand the volume of purchases by consumers, both online and offline, and to identify which products are increasing in share. This gives them a better understanding of whether traffic is declining or growing, as well as insights into cross-shopping behaviors.

— Investors study app downloads and digital activity to understand how consumer preferences are changing and how effective an organization’s digital strategy is relative to that of its peers. For instance, app downloads, activity, and rating data can provide a window into the success rates of the myriad of live-streaming exercise offerings that have become available over the last year.

Exhibit 1
Companies can obtain data from many types of external sources.

Geospatial and satellite: Point of interest; Footfall; Real estate/property
Weather: Temperature and precipitation; Storms and adverse events; Forecasts
Private business: Revenues; Head counts; Locations; Industry classifications; Technographics
News, IP, and legal: News-feed services; Research-journal feeds; Patent filings; Legal actions
Consumer: Transactions; Consumer panels; Search trends data
Public data: Federal filings; State and local filings; Census data; Macroeconomic indicators
Web-harvested, online and app: Online reviews; Job and product listings; Web-traffic data; Digital-app metrics
Industry specific: Trade flows and shipping; Hotel and travel bookings; Healthcare claims; Agriculture/crops

Corporations have also started to explore how they can derive more value from external data (Exhibit 2). For example, a large insurer transformed its core processes, including underwriting, by expanding its use of external-data sources from a handful to more than 40 in the span of two years. The effort involved was considerable; it required prioritization from senior leadership, dedicated resources, and a systematic approach to testing and applying new data sources. The hard work paid off, increasing the predictive power of core models by more than 20 percent and dramatically reducing application complexity by allowing the insurer to eliminate many of the questions it typically included on customer applications.

Exhibit 2
External data can help companies create value in several key areas.

Customer analytics
  Identify ideal B2B prospects and look-alikes by leveraging firmographics, employment growth, technographics, retirement-plan investments, etc.
  Identify fast-growing consumer trends and marketing opportunities by utilizing search data, social-media analysis, transaction panels, and receipt panels

Strategic analysis
  Benchmark organizational talent against peers by analyzing existing talent profiles and job postings
  Identify product-improvement opportunities by analyzing reviews across social-media and e-commerce platforms

Operations and forecasting
  Predict how real-estate prices will change, based on local market characteristics and demographic shifts
  Forecast which customer segments will grow, using firmographics, technographics, and private-company data

Risk management
  Reduce operational risks, based on real-time analysis of news and social-media data for raw-material suppliers
  Reduce supplier and reputational risks by understanding parent-subsidiary relationships, ownership, news, and legal proceedings

Three steps to creating value with external data
Use of external data has the potential to be game changing across a variety of business functions and sectors. The journey toward successfully using external data has three key steps.

1. Establish a dedicated team for external-data sourcing
To get started, organizations should establish a dedicated data-sourcing team. In our experience, a key role on this team is a dedicated data scout or strategist who partners with the data-analytics team and business functions to identify operational, cost, and growth improvements that could be powered by external data. This person also would be responsible for building excitement around what can be made possible through the use of external data, planning the use cases to focus on, identifying and prioritizing data sources for investigation, and measuring the value generated through use of external data. Ideal candidates for this role are individuals who have served as analytics translators and who have experience in deploying analytics use cases and in working with technology, business, and analytics profiles.

The other team members, who should be drawn from across functions, would include purchasing experts, data engineers, data scientists and analysts, technology experts, and data-review-board members (Exhibit 3). These team members typically spend only part of their time supporting the data-sourcing effort. For example, the data analysts and data scientists may already be supporting data cleaning and modeling for a specific use case and help the sourcing work stream by applying the external data to assess its value. The purchasing expert, already well versed in managing contracts, will build specialization on data-specific licensing approaches to support those efforts.

Exhibit 3
An effective data-sourcing team combines six roles, including data scouts and data reviewers.

Data scouts/strategists: partner with business to find, review, assess, and manage external-data assets
Data reviewers: evaluate use, risk, and collection of data
Purchasing experts: contract, license, and negotiate with data providers
Data scientists and analysts: apply data to use cases, measuring uplift/value in models and analyses
Data engineers: ingest, prepare, and apply data in use cases
Architects and DevOps engineers: develop platform to integrate and manage access to data
(The exhibit’s legend distinguishes new key roles from understood roles.)

Throughout the process of finding and using external data, companies must keep in mind privacy concerns and consumer scrutiny, making data-review roles essential peripheral team members. Data reviewers, who typically include legal, risk, and business leaders, should thoroughly vet new consumer data sets (for example, financial transactions, employment data, and cell-phone data indicating when and where people have entered retail locations). The vetting process should ensure that all data were collected with appropriate permissions and will be used in a way that abides by relevant data-privacy laws and passes muster with consumers.

This team will need a budget to procure small exploratory data sets, establish relationships with data marketplaces (such as by purchasing trial licenses), and pay for technology requirements (such as expanded data storage).

2. Develop relationships with data marketplaces and aggregators
While online searches may appear to be an easy way for data-sourcing teams to find individual data sets, that approach is not necessarily the most effective. It generally leads to a series of time-consuming vendor-by-vendor discussions and negotiations. The process of developing relationships with a vendor, procuring sample data, and negotiating trial agreements often takes months.

A more effective strategy involves using data-marketplace and -aggregation platforms that specialize in building relationships with hundreds of data sources, often in specific data domains (for example, consumer, real-estate, government, or company data). These relationships can give organizations ready access to the broader data ecosystem through an intuitive search-oriented platform, allowing organizations to rapidly test dozens or even hundreds of data sets under the auspices of a single contract and negotiation. Since these external-data distributors have already profiled many data sources, they can be valuable thought partners and can often save an external-data team significant time. When needed, these data distributors can also help identify valuable data products and act as the broker to procure the data.

Once the team has identified a potential data set, the team’s data engineers should work directly with business stakeholders and data scientists to evaluate the data and determine the degree to which the data will improve business outcomes. To do so, data teams establish evaluation criteria, assessing data across a variety of factors to determine whether the data set has the necessary characteristics for delivering valuable insights (Exhibit 4).



Exhibit 4
A thorough evaluation of external data explores criteria in ten areas.

Depth and breadth of data
  Does the data set have relevant data elements for its main use cases?
  Are there additional fields that could be useful or apply to other areas of the business?

Match/fill rates
  Are the match rates high enough to justify a return on investment?

Data profile
  Does the data profile describe the distribution of the data, frequencies by column, missing variables, and changes in variables and panel over time?
  How does the data set compare with ground truth?

Coverage/panel overview
  Does the data set cover the right geography and population?
  Does the way the data are sourced introduce any bias in the data, relative to the use case?

Timeliness of data
  How far back in time is the data set reliable? Is history needed?
  How frequently are the data updated? What is the delay?

Data delivery
  How are the data delivered?
  Are application programming interfaces (APIs) available? What is the format?
  How do you know the data have transferred successfully?

Potential impact/lift
  Can the proposed or planned analysis be completed?
  Is there an evaluation data set to model against?
  What impact does the data generate?

Total cost
  What is the total cost for the data in evaluation and production?
  Do you need engineering resources to make the data accessible?

Procurement and contracting
  How quickly will you be able to onboard the data set when needed?
  Who else needs to review the agreement and technology?
  What representations, warranties, etc. do you need from the vendor?

Risk
  Is the data set sourced appropriately and ethically?
  Is the data set being used appropriately?
  Is there reputational risk?
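
Several of the Exhibit 4 criteria can be checked with a few lines of code on a sample file. The following Python sketch is illustrative only and is not taken from the article: the field names (`company`, `revenue`, `region`), the sample records, and the name-normalization rule used for matching are all assumptions, and a real evaluation would use the organization's own keys and a more careful matching method.

```python
from collections import Counter


def fill_rate(rows, field):
    """Share of rows where `field` is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)


def normalize_name(name):
    """Crude matching key (an assumption): lowercase, letters and digits only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def match_rate(internal_rows, external_rows, key="company"):
    """Share of internal records that find a counterpart in the external set."""
    if not internal_rows:
        return 0.0
    external_keys = {normalize_name(r[key]) for r in external_rows}
    matched = sum(
        1 for r in internal_rows if normalize_name(r[key]) in external_keys
    )
    return matched / len(internal_rows)


def segment_shares(rows, field):
    """Distribution of rows across the values of `field`, to spot coverage bias."""
    counts = Counter(r.get(field) for r in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items()}


# Hypothetical sample data: an internal customer list and a vendor sample file.
internal = [
    {"company": "Acme Corp."},
    {"company": "Globex"},
    {"company": "Initech"},
    {"company": "Umbrella"},
]
external = [
    {"company": "ACME Corp", "revenue": 120, "region": "US"},
    {"company": "globex", "revenue": None, "region": "US"},
    {"company": "Hooli", "revenue": 75, "region": "US"},
    {"company": "Initech", "revenue": 40, "region": "EU"},
]

print("match rate:", match_rate(internal, external))
print("revenue fill rate:", fill_rate(external, "revenue"))
print("coverage by region:", segment_shares(external, "region"))
```

Comparing the coverage shares with the population the use case targets is one simple way to surface sourcing bias, such as a panel that is heavily skewed toward one region or one consumer segment.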

Data assessments should include an examination of quality indicators, such as fill rates, coverage, bias, and profiling metrics, within the context of the use case. For example, a transaction data provider may claim to have hundreds of millions of transactions that help illuminate consumer trends. However, if the data include only transactions made by millennial consumers, the data set will not be useful to a company seeking to understand broader, generation-agnostic consumer trends.

3. Prepare the data architecture for new external-data streams
Generating a positive return on investment from external data calls for up-front planning, a flexible data architecture, and ongoing quality-assurance testing.

Up-front planning starts with an assessment of the existing data environment to determine how it can support ingestion, storage, integration, governance, and use of the data. The assessment covers issues such as how frequently the data come in, the amount of data, how data must be secured, and how external data will be integrated with internal data. This will provide insights about any necessary modifications to the data architecture.¹

Modifications should be designed to ensure that the data architecture is flexible enough to support the integration of a continuous “conveyor belt” of incoming data from a variety of data sources, for example, by enabling application-programming-interface (API) calls from external sources along with entity-resolution capabilities to intelligently link the external data to internal data. In other cases, it may require tooling to support large-scale data ingestion, querying, and analysis. Data architecture and underlying systems can be updated over time as needs mature and evolve.

The final process in this step is ensuring an appropriate and consistent level of quality by constantly monitoring the data used. This involves examining data regularly against the established quality framework to identify whether the source data have changed and to understand the drivers of any changes (for example, schema updates, expansion of data products, change in underlying data sources). If the changes are significant, algorithmic models leveraging the data may need to be retrained or even rebuilt.

Minimizing risk and creating value with external data will require a unique mix of creative problem solving, organizational capability building, and laser-focused execution. That said, business leaders who demonstrate the achievements possible with external data can capture the imagination of the broader leadership team and build excitement for scaling beyond early pilots and tests. An effective route is to begin with a small team that is focused on using external data to solve a well-defined problem and then use that success to generate momentum for expanding external-data efforts across the organization.
examining data regularly against the established

Mohammed Aaser is McKinsey’s chief data officer, based in the Minneapolis office, and Doug McElhaney
is a partner in the Washington, DC, office.

The authors wish to thank Kelly Brennan and Aqil Datoo for their contributions to this article.

Copyright © 2021 McKinsey & Company. All rights reserved.

¹ For more, see Antonio Castro, Jorge Machado, Matthias Roggendorf, and Henning Soller, “How to build a data architecture to drive
innovation—today and tomorrow,” June 2020, McKinsey.com.

