Harnessing The Power of External Data: Technology
February 2021
Many companies have made great strides in collecting and utilizing data from their own activities. So far, though, comparatively few have realized the full potential of linking internal data with data provided by third parties, vendors, or public data sources. Overlooking such external data is a missed opportunity. Organizations that stay abreast of the expanding external-data ecosystem and successfully integrate a broad spectrum of external data into their operations can outperform other companies by unlocking improvements in growth, productivity, and risk management.

The COVID-19 crisis provides an example of just how relevant external data can be. In a few short months, consumer purchasing habits, activities, and digital behavior changed dramatically, making preexisting consumer research, forecasts, and predictive models obsolete. Moreover, as organizations scrambled to understand these changing patterns, they discovered little of use in their internal data. Meanwhile, a wealth of external data could, and still can, help organizations plan and respond at a granular level.

Although external-data sources offer immense potential, they also present several practical challenges. To start, simply gaining a basic understanding of what’s available requires considerable effort, given that the external-data environment is fragmented and expanding quickly. Thousands of data products can be obtained through a multitude of channels, including data brokers, data aggregators, and analytics platforms, and the number grows every day. Analyzing the quality and economic value of data products also can be difficult. Moreover, efficient usage and operationalization of external data may require updates to the organization’s existing data environment, including changes to systems and infrastructure. Companies also need to remain cognizant of privacy concerns and consumer scrutiny when they use some types of external data.

These challenges are considerable but surmountable. This article discusses the benefits of tapping external-data sources, illustrated through a variety of examples, and lays out best practices for getting started. These include establishing an external-data strategy team and developing relationships with data brokers and marketplace partners. Company leaders, such as the executive sponsor of a data effort and a chief data and analytics officer, and their data-focused teams should also learn how to rigorously evaluate and test external data before using and operationalizing the data at scale.

External-data success stories
Companies across industries have begun successfully using external data from a variety of sources (Exhibit 1). The investment community is a pioneer in this space. To predict outcomes and generate investment returns, analysts and data scientists in investment firms have gathered “alternative data” from a variety of licensed and public data sources, many of which draw from the “digital exhaust” of a growing number of technology companies and the public web. Investment firms have established teams that assess hundreds of these data sources and providers and then test their effectiveness in investment decisions.

Exhibit 1 (excerpt): external-data categories and example data types
Geospatial and satellite: point of interest; footfall; real estate/property
Weather: temperature and precipitation; storms and adverse events; forecasts
Web-harvested, online and app: online reviews; job and product listings; web-traffic data; digital-app metrics
Industry specific: trade flows and shipping; healthcare claims; agriculture/crops; hotel and travel bookings

A broad range of data sources are used, and these inform investment decisions in a variety of ways:

- Investors actively gather job postings, company reviews posted by employees, employee-turnover data from professional networking and career websites, and patent filings to understand company strategy and predict financial performance and organizational growth.

- Analysts use aggregated transaction data from card processors and digital-receipt data to understand the volume of purchases by consumers, both online and offline, and to identify which products are increasing in share. This gives them a better understanding of whether traffic is declining or growing, as well as insights into cross-shopping behaviors.
- Investors study app downloads and digital activity to understand how consumer preferences are changing and how effective an organization’s digital strategy is relative to that of its peers. For instance, app downloads, activity, and rating data can provide a window into the success rates of the myriad of live-streaming exercise offerings that have become available over the last year.

Corporations have also started to explore how they can derive more value from external data (Exhibit 2). For example, a large insurer transformed its core processes, including underwriting, by expanding its use of external-data sources from a handful to more than 40 in the span of two years. The effort involved was considerable; it required prioritization from senior leadership, dedicated resources, and a systematic approach to testing and applying new data sources. The hard work paid off, increasing the predictive power of core models by more than 20 percent and dramatically reducing application complexity by allowing the insurer to eliminate many of the questions it typically included on customer applications.

Three steps to creating value with external data
Use of external data has the potential to be game changing across a variety of business functions and sectors. The journey toward successfully using external data has three key steps.

1. Establish a dedicated team for external-data sourcing
To get started, organizations should establish a dedicated data-sourcing team. In our experience, a key role on this team is a dedicated data scout or strategist who partners with the data-analytics team and business functions to identify operational, cost, and growth improvements that could be powered by external data. This person also would be responsible for building excitement around what can be made possible through the use of external data, planning the use cases to focus on, identifying and prioritizing data sources for investigation, and measuring the value generated through use of external data. Ideal candidates for this role are individuals who have served as analytics translators and who have experience in deploying analytics use cases and in working with technology, business, and analytics profiles.

The other team members, who should be drawn from across functions, would include purchasing experts, data engineers, data scientists and analysts, technology experts, and data-review-board members (Exhibit 3). These team members typically spend only part of their time supporting the data-sourcing effort. For example, the data analysts and data scientists may already be supporting data cleaning and modeling for a specific use case and help the sourcing work stream by applying the external data to assess its value. The purchasing expert, already well versed in managing contracts, will build specialization on data-specific licensing approaches to support those efforts.

Exhibit 3 (excerpt): team roles
Data scouts/strategists: partner with business to find, review, assess, and manage external-data assets
Data reviewers: evaluate use, risk, and collection of data

Throughout the process of finding and using external data, companies must keep in mind privacy concerns and consumer scrutiny, making data-review roles essential peripheral team members. Data reviewers, who typically include legal, risk, and business leaders, should thoroughly vet new consumer data sets, for example, financial transactions, employment data, and cell-phone data indicating when and where people have entered retail locations. The vetting process should ensure that all data were collected with appropriate permissions and will be used in a way that abides by relevant data-privacy laws and passes muster with consumers.

This team will need a budget to procure small exploratory data sets, establish relationships with data marketplaces (such as by purchasing trial licenses), and pay for technology requirements (such as expanded data storage).

2. Develop relationships with data marketplaces and aggregators
While online searches may appear to be an easy way for data-sourcing teams to find individual data sets, that approach is not necessarily the most effective. It generally leads to a series of time-consuming vendor-by-vendor discussions and negotiations. The process of developing relationships with a vendor, procuring sample data, and negotiating trial agreements often takes months.

A more effective strategy involves using data-marketplace and -aggregation platforms that specialize in building relationships with hundreds of data sources, often in specific data domains (for example, consumer, real-estate, government, or company data). These relationships can give organizations ready access to the broader data ecosystem through an intuitive search-oriented platform, allowing organizations to rapidly test dozens or even hundreds of data sets under the auspices of a single contract and negotiation. Since these external-data distributors have already profiled many data sources, they can be valuable thought partners and can often save an external-data team significant time. When needed, these data distributors can also help identify valuable data products and act as the broker to procure the data.

Once the team has identified a potential data set, the team’s data engineers should work directly with business stakeholders and data scientists to evaluate the data and determine the degree to which the data will improve business outcomes. To do so, data teams establish evaluation criteria, assessing data across a variety of factors to determine whether the data set has the necessary characteristics for delivering valuable insights (Exhibit 4).
Exhibit 4: evaluation criteria
Depth and breadth of data: Does the data set have relevant data elements for its main use cases? Are there additional fields that could be useful or apply to other areas of the business?
Match/fill rates: Are the match rates high enough to justify a return on investment?
Data profile: Does the data profile describe the distribution of the data, frequencies by column, missing variables, and changes in variables and panel over time? How does the data set compare with ground truth?
Coverage/panel overview: Does the data set cover the right geography and population? Does the way the data are sourced introduce any bias in the data, relative to the use case?
Timeliness of data: How far back in time is the data set reliable? Is history needed? How frequently are the data updated? What is the delay?
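
Several of the Exhibit 4 criteria can be turned into simple numbers when a sample data set arrives. The following is a minimal sketch in Python, not a method described in this article: the function names, the field names (`customer_id`, `age_band`, `spend`, `txn_date`), the sample records, and the reference population shares are all hypothetical and chosen only for illustration.

```python
from datetime import date

def fill_rate(rows, field):
    """Share of rows in which `field` is present and non-empty."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(field) not in (None, ""))
    return filled / len(rows)

def match_rate(internal_keys, external_rows, key):
    """Share of internal records that match at least one external record."""
    internal = set(internal_keys)
    if not internal:
        return 0.0
    external = {r.get(key) for r in external_rows}
    return len(internal & external) / len(internal)

def segment_shares(rows, field):
    """Distribution of the non-missing values of `field`, as shares."""
    counts = {}
    for r in rows:
        value = r.get(field)
        if value not in (None, ""):
            counts[value] = counts.get(value, 0) + 1
    total = sum(counts.values())
    return {k: c / total for k, c in counts.items()} if total else {}

def max_coverage_gap(sample_shares, reference_shares):
    """Largest absolute gap between sample and reference segment shares."""
    segments = set(sample_shares) | set(reference_shares)
    return max(
        abs(sample_shares.get(s, 0.0) - reference_shares.get(s, 0.0))
        for s in segments
    )

def staleness_days(rows, date_field, as_of):
    """Days between `as_of` and the most recent record in the sample."""
    latest = max(r[date_field] for r in rows if r.get(date_field))
    return (as_of - latest).days

# Hypothetical vendor sample and internal customer list.
external = [
    {"customer_id": "c1", "age_band": "25-40", "spend": 120.0, "txn_date": date(2021, 1, 28)},
    {"customer_id": "c2", "age_band": "25-40", "spend": None, "txn_date": date(2021, 1, 30)},
    {"customer_id": "c3", "age_band": "25-40", "spend": 75.5, "txn_date": date(2021, 1, 15)},
    {"customer_id": "c9", "age_band": "41-60", "spend": 40.0, "txn_date": date(2021, 1, 20)},
]
internal_ids = ["c1", "c2", "c3", "c4", "c5"]
# Assumed reference population shares, e.g. taken from census-style data.
reference = {"under-25": 0.2, "25-40": 0.3, "41-60": 0.3, "over-60": 0.2}

report = {
    "fill_rate_spend": fill_rate(external, "spend"),                  # 3 of 4 rows
    "match_rate": match_rate(internal_ids, external, "customer_id"),  # 3 of 5 ids
    "coverage_gap": max_coverage_gap(
        segment_shares(external, "age_band"), reference
    ),                                                                # about 0.45
    "staleness_days": staleness_days(external, "txn_date", date(2021, 2, 1)),
}
print(report)
```

In this toy sample, the large coverage gap reflects a panel that skews heavily toward one age band, which is the kind of sourcing bias the coverage criterion asks about. What counts as an acceptable fill rate, match rate, or gap depends on the use case and is a judgment for the data team.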
Data assessments should include an examination of quality indicators, such as fill rates, coverage, bias, and profiling metrics, within the context of the use case. For example, a transaction data provider may claim to have hundreds of millions of transactions that help illuminate consumer trends. However, if the data include only transactions made by millennial consumers, the data set will not be useful to a company seeking to understand broader, generation-agnostic consumer trends.

3. Prepare the data architecture for new external-data streams
Generating a positive return on investment from external data calls for up-front planning, a flexible data architecture, and ongoing quality-assurance testing.

Up-front planning starts with an assessment of the existing data environment to determine how it can support ingestion, storage, integration,
Mohammed Aaser is McKinsey’s chief data officer, based in the Minneapolis office, and Doug McElhaney
is a partner in the Washington, DC, office.
The authors wish to thank Kelly Brennan and Aqil Datoo for their contributions to this article.
1 For more, see Antonio Castro, Jorge Machado, Matthias Roggendorf, and Henning Soller, “How to build a data architecture to drive innovation—today and tomorrow,” June 2020, McKinsey.com.