The Programmatic Primer Data Management
In this collection of articles the various elements of the online advertising ecosystem are explained. The
articles enable advertisers to take advantage of the opportunities offered by programmatic advertising
through understanding the companies and processes involved. This article focuses on DMPs, data
suppliers and data aggregators.
This article is part of The Programmatic Primer, Warc's essential guide for advertisers to the latest online
advertising techniques. The report includes detailed advice on how to work with companies in the online
advertising ecosystem, plus use cases of programmatic in action.
If you've been following along in the hymnal, you already know that in the display ecosystem, our goal is to give
the right message to the right consumer. You also know that a web page can give the consumer an ad via some
combination of ecosystem players. The party who decides they need this consumer's attention right now should
be willing to bid high. But how do they know how much to bid, or whether to bid at all?
The answer is: because they know something. That's the role of data. A consumer ID of some sort, often a
cookie, anchors data organization in databases that contain consumer profiles, and becomes the basis to
evaluate an opportunity to place a creative.
The Holy Grail of data management is a 'persistent cross-device ID'. In other words, a person. Cookies can be
deleted, and people get new mobile phones and computers. When the identification goes dark, so does all the
data that goes with it. So, when you delete your cookies, for example, all the data that tells the ecosystem
about your behaviors remains in databases, but no one will ever get a match to a deleted cookie. For this
reason, companies often delete any cookie over a certain age. If an ID is persistent, as it is with Facebook, or
Yahoo (anywhere you are registered), it is valid for a long time, and thus can be the basis for accumulating a
long history of behaviors. Note that these registration-based IDs are often not shared, or shareable outside of
the company that has the registration, and so cannot be used to target media at other publishers. Facebook has
found a way around this using the Atlas ad serving infrastructure to mask identities, but one has to wonder how
much enthusiasm they have for improving the performance of their competitors.
Data is generally data about how browsers behave. For example, browser #1438954 visited Amazon and went
to a page about baby care products. Only companies with access to that browser at that moment (tags) can get
that little tidbit of information. No one can follow you. Cookies don't know who you are and they cannot find you.
They can only recognize you. Cookies enable a computer to recognize your browser if your impression arrives
by chance at a server that knows your cookie. Here's a simple metaphor.
If I hired a detective to track you and all they could say in the end was: "They went to a jewelry store, maybe a
bunch of other places I don’t know about, and I don't know their name or address, and I can't contact them
unless they arrive by coincidence at the corner of 5th and 49th street today, and if they do, all I can do then is
show them a picture, and I am only 60% sure they are male," I would fire that detective. That's all so-called
'tracking' can do. It's unreliable, but much better than a wild guess. Data simply raises the odds, to a greater or
lesser extent, that you will give the right message to a targeted population. With 'retargeting', the odds are much
better because the targeting ties directly to a specific behavior, rather than to an inferred attribute – but you still
have to hope they will appear in an impression to which you have visibility (via a bid request, normally).
Cookies are not new. They were written into the web protocols as a way to bind transactions together. For
example, when you click 'add to cart' on Amazon, it needs to know which one of the millions of virtual carts
containing goods at this moment is yours. The cookie was designed to make that connection. The use of a cookie for ad
targeting was not the intent of the people who invented the web.
By convention, a domain (the .com) only gets a cookie from a browser if it wrote that cookie in the first place.
Thus, if you're surfing at Pantene.com, they can set a cookie, but if you cruise over to Amazon, they cannot see
that you were at Pantene unless someone sold them the data.
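The scoping convention can be sketched in a few lines of Python. This is a toy model, not a browser implementation: the `CookieJar` class, its method names and the `visitor_id` cookie are hypothetical, and real browsers apply further rules (paths, expiry, subdomains) that are left out here.

```python
class CookieJar:
    """Toy model of a browser's cookie store, keyed by the domain that set each cookie."""

    def __init__(self):
        self._by_domain = {}

    def set_cookie(self, domain, name, value):
        # A server can only write cookies into its own domain's slot.
        self._by_domain.setdefault(domain, {})[name] = value

    def cookies_for(self, domain):
        # On a request, the browser hands back only the cookies this domain wrote.
        return dict(self._by_domain.get(domain, {}))


jar = CookieJar()
jar.set_cookie("pantene.com", "visitor_id", "abc123")

pantene_view = jar.cookies_for("pantene.com")  # Pantene sees its own cookie
amazon_view = jar.cookies_for("amazon.com")    # Amazon sees nothing from Pantene
print(pantene_view)
print(amazon_view)
```

Amazon's view is empty even though the same browser holds Pantene's cookie, which is why data has to be exchanged by other means.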
Data can be bought and sold despite that limitation because in the rendering of any page, a dozen different
domains may be involved. This allows 'cookie sync' among ecosystem players and data companies so that one
party’s profile database can get a match on an element of data that may have originated with another domain.
The process of creating equivalence among cookies from different domains is sometimes called 'domain space
resolution' (impress your friends with that one). That process is regarded as the tricky part, even among the
technorati.
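As a rough illustration of what a cookie sync produces, the sketch below keeps a table of equivalent IDs between two parties. Everything named here (`SyncTable`, the `dsp-` and `dmp-` ID formats, the profile layout) is an assumption made for the example, and production systems are considerably more involved.

```python
class SyncTable:
    """Toy cookie-sync table: which IDs from two domains refer to the same browser."""

    def __init__(self):
        self._dsp_to_dmp = {}

    def record_sync(self, dsp_cookie, dmp_cookie):
        # Recorded when both parties' servers see the same browser during one page render.
        self._dsp_to_dmp[dsp_cookie] = dmp_cookie

    def translate(self, dsp_cookie):
        # Returns the partner's ID for this browser, or None if the two never synced.
        return self._dsp_to_dmp.get(dsp_cookie)


# Profiles held by a hypothetical data company, keyed by its own cookie ID.
dmp_profiles = {"dmp-777": {"segments": ["baby care"]}}

sync = SyncTable()
sync.record_sync("dsp-1438954", "dmp-777")


def lookup_profile(dsp_cookie):
    """Find the data company's profile for a browser known only by the buyer's cookie."""
    dmp_cookie = sync.translate(dsp_cookie)
    return dmp_profiles.get(dmp_cookie)


matched = lookup_profile("dsp-1438954")    # synced browser: profile found
unmatched = lookup_profile("dsp-0000001")  # never synced: no match, no data
print(matched, unmatched)
```

Without an entry in the sync table, the profile exists but cannot be reached, which is the practical meaning of "no match".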
If a server involved in a page operation is not in the domain in the address bar, it is said to be a 'third party'. The
server has been called into duty by a page for some task, such as placing an ad or showing an image. Because
of this, several domains' cookies may understand something about your browser's past behavior. If you want to
see who is collecting data from you, install a free plugin from Evidon called Ghostery, which shows you which
domains or companies are collecting data. Ghostery can give you an indication of how ecosystem-agile a
publisher has become, or how they measure success, or where they get data.
Figure 1 is a screenshot showing Ghostery's output for the Forbes.com homepage. The blue box on the right
shows tags running on the Forbes homepage.
Not all the companies in the list are taking cookies, but they are all collecting data with the permission of the
publisher. From Ghostery you can tell a lot of things about Forbes instantly. It deals with comScore, Turn (a
DSP), it runs Twitter widgets, and much more. The appearance of the DSP tags means only that the DSP
bought an ad on that page, and the appearance of the AppNexus tag suggests that at least one of them bought
the ad on an exchange. So we can learn a lot about a publisher simply from the data collection tags observed by
Ghostery.
Data can be used for applications beyond display advertising. Today, the ecosystem can even put photos on
your website that correspond to past behaviors of the current visitor. For instance, x+1 (now absorbed into Rocket
Fuel) helps Delta Airlines place offers on its website that correspond to your potential travel desires. So Delta is
customizing its own website in real-time, one-to-one, based on behavioral data it got from other websites, or its
own. A display ad is just an object – a picture, a video, a script, or a bit of text. There's no reason we can't use
the ecosystem to make all the content on a web page dynamic and customized.
Data suppliers
Data suppliers tend to specialize in particular data sources. Experian gets credit data (of course, it does not
come with a cookie attached). DataLogix (now absorbed by Oracle) gets purchase data from retail loyalty
programs.
To some extent, offline data such as loyalty data can be associated with an online identifier, such as a cookie,
using a process called onramp. When grocery data is onramped, for example, it is possible to cookie-target
consumers on the basis of their purchases. This practice has some privacy issues, but with double-blind
matching practices, the consumer's identity is safe. DataLogix and LiveRamp are the leaders in this field, but
any DMP, network or DSP can arrange to have data onramped if the underlying sources are available for
matching. Facebook uses onramped shopping data as a measurement tool to help CPG companies tell if
Facebook media resulted in offline purchases.
So, the possibilities are (once again) endless. Basically, any consumer behavior that leaves a data trail can
become a signal to generate an advertisement. Brands can do research to see what behaviors signal "entry" to
their marketplace, and target them directly.
Some big sources of data supply are not apparent to observers. Suppliers of publisher tools, for example, put
code, normally with a tag or widget, on a page (any page) with the permission of the publisher. The code sets a
cookie in the collector's domain, and collects data such as the page the browser visited, or the time spent on the
page, or the titles of the articles on the page, or the keywords on the page – you name it. If a script can see
the data, it can be harvested. These companies (for example, AddThis) sell it directly to data aggregators.
Data aggregators
One problem with all the data collected by the collection companies is that it's hard to know what it means. I just
checked the weather; does that mean anything about me? Doubtful. Data aggregators (such as eXelate and
BlueKai) are about deriving meaning. The big aggregators, like BlueKai, ascribe meaning to web behaviors.
They sell segments (a list of cookies) that indicate useful information about a browser’s likely preferences. Want
'Auto-intenders', 'Canadian females', or 'Beauty interested'? They got 'em. DMPs and data aggregators often
classify data with terms such as 'intention data', or 'interest data'. This helps advertisers match messaging to
targeting.
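Since a segment is just a list of cookies, using one comes down to a membership check. The sketch below assumes made-up cookie IDs and hypothetical helper names (`segments_for`, `should_bid`); the segment names are taken from the examples above.

```python
# Each segment is a set of cookie IDs sold by a (hypothetical) aggregator.
segments = {
    "Auto-intenders": {"c1", "c2", "c3"},
    "Canadian females": {"c2", "c4"},
}


def segments_for(cookie_id):
    """List every segment that contains this cookie, in alphabetical order."""
    return sorted(name for name, cookies in segments.items() if cookie_id in cookies)


def should_bid(cookie_id, target_segment):
    """A buyer targeting one segment bids only when the cookie is on that list."""
    return cookie_id in segments.get(target_segment, set())


both = segments_for("c2")
bid_on_c1 = should_bid("c1", "Auto-intenders")
bid_on_c4 = should_bid("c4", "Auto-intenders")
print(both, bid_on_c1, bid_on_c4)
```

The check says nothing about whether the label is accurate; it only tells the buyer that someone put this cookie on the list.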
Depending on who you ask, the meanings can be a stretch. Do 'Auto-intenders' really want to buy a car? Are
'females' really female? (Only about 65% of cookies labeled as female actually are.) It is in the interest of data
aggregators to be, shall we say, generous in their inferences. For example, the cookies of 'Auto-intenders' are
valuable to a car company, so data aggregators make more money when they infer that you are an 'Auto-
intender'.
Buyers should understand the pedigree of the data they are buying or risk buying data that is not what it seems
to be.
All this data ends up in a consumer profile, anchored by a cookie or other ID, and is subject to becoming
obsolete if the consumer deletes their cookies or changes devices. A good DMP or aggregator might expire any
cookie over 12 weeks old. Remember that you can't get a match on a deleted cookie because a bid request will
never contain a cookie that’s deleted. Simply, you will buy the data and get no reach. Some companies now
charge for data use only if the data point provides real reach. This is a significant negotiating point in a deal with
any data provider, but difficult to account for.
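To make the expiry and reach problem concrete, here is a minimal sketch. It assumes a cookie's age is measured from the date it was last seen, which is only one possible reading of 'over 12 weeks old', and the dates, cookie IDs and function names are invented for the example.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(weeks=12)  # the 12-week cutoff mentioned above


def expire_old(last_seen, today):
    """Keep only cookies seen within MAX_AGE of today (age basis is an assumption)."""
    return {cid: seen for cid, seen in last_seen.items() if today - seen <= MAX_AGE}


def match_rate(segment_cookies, live_cookies):
    """Share of purchased cookies that can still turn up in a bid request."""
    if not segment_cookies:
        return 0.0
    return len(segment_cookies & live_cookies) / len(segment_cookies)


today = date(2015, 6, 1)
last_seen = {
    "c1": date(2015, 5, 20),  # recent, kept
    "c2": date(2015, 1, 5),   # far older than 12 weeks, expired
    "c3": date(2015, 4, 1),   # within 12 weeks, kept
}
fresh = expire_old(last_seen, today)

# A purchased segment of four cookies, of which only two are still live.
rate = match_rate({"c1", "c2", "c3", "c4"}, set(fresh))
print(sorted(fresh), rate)
```

A buyer paying for the whole segment in this example pays for four cookies and can reach two, which is the gap a reach-based pricing deal tries to close.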
In addition to a cookie, every browser and phone has a 'fingerprint'. This is a composite of elements that
can be gathered by a script. It's a reasonable facsimile of a persistent ID but, of course, does not span
devices. Thus, with a fingerprint, the ecosystem thinks I'm a different person on each computer I use. The fingerprint
can be used to circumvent some of the problems with cookie deletion and re-establish that the consumer profile
formerly belonging to cookie 123 is now cookie 456. The fingerprint bridges the gap and thus preserves the
profile.
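The bridging step might look something like the following sketch. The fingerprint value, the profile layout and the `resolve` function are hypothetical, and how a fingerprint is actually computed is out of scope here.

```python
# Profiles keyed by cookie, plus a side table from fingerprint to the last known cookie.
profiles = {"cookie-123": {"interests": ["baby care"]}}
fingerprint_to_cookie = {"fp-9f2c": "cookie-123"}


def resolve(cookie_id, fingerprint):
    """Return the profile for this impression, re-keying it if the cookie has changed."""
    if cookie_id in profiles:
        return profiles[cookie_id]
    old_cookie = fingerprint_to_cookie.get(fingerprint)
    if old_cookie is not None and old_cookie in profiles:
        # Same fingerprint, new cookie: move the existing profile to the new ID.
        profiles[cookie_id] = profiles.pop(old_cookie)
    else:
        # Unknown browser: start an empty profile.
        profiles[cookie_id] = {"interests": []}
    fingerprint_to_cookie[fingerprint] = cookie_id
    return profiles[cookie_id]


# The browser deleted cookie-123 and now shows up as cookie-456 with the same fingerprint.
recovered = resolve("cookie-456", "fp-9f2c")
print(recovered, sorted(profiles))
```

After the call, the history that belonged to cookie 123 is filed under cookie 456, and the old key is gone.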
How reliable is data? It's easy to see what data aggregators think you are. Just go to
www.bluekai.com/registry and check. You might be surprised at how accurate they are in some aspects, and
how totally wrong in others.
If that seems complicated, keep in mind that identity, even anonymous identity, underlies a $30 billion
industry that is generating huge return for advertisers. Persistent identity is the gold in the programmatic gold
rush. Who has the most gold? The publishers that know your name or email address. Those don't change much,
so they rarely have to throw away data.
DMPs can integrate data from any source to provide other decision criteria for bidders, as long as they get an ID
match. For example, they have data relationships in which they can use retail loyalty data to match a cookie with
shopping behavior (onramping, as previously described).
So, DMPs tend to be suppliers to anyone on either side of the marketplace who needs data for use in targeting.
Regarding how to evaluate DMP services, the only sure way for an advertiser to know if a DMP can do what is
needed is to be very specific about what that is.
Differentiation is a big issue for DMPs. Key differentiators include how data can be used and extracted, external
data relationships, the ability to syndicate out (integrations with those using the data), self-serve interfaces,
and the uniqueness and quality of the segments they sell.
How should an advertiser pay for data? Maybe they should pay only for data that caused an impression to be
served, or maybe they should pay for 'segments', even if some of the cookies are deleted. In either case, how
would you account for that? In many scenarios, the targeting data is worth more than the media impression that
was targeted using the data.
DMPs are in the business of providing targeting data, which is, in effect, measurement. Note that measurement
suppliers (such as Vizu) can now use cookie targeting to recruit for surveys. So, for example, you can ask
questions of people who bought a particular product by onramping shopping data. Or ask people who searched
for 'stereo system' about their listening habits and use the information to design speakers. There is simply no
limit except the imagination of the marketer. Any DMP will talk to you, and they all understand the ecosystem –
because data is central.
Some key capabilities that differentiate DMPs are their ability to push data to other platforms (extract data), and
to develop custom audiences based on complex criteria. An example would be households that contain Hispanic
teens in a list of zip codes, and consume certain types of media. Other differentiators might be the ability to
anticipate reach for a segment, or the ability to get cookie sync with data provided by the advertiser.
The current trend (2013–2015) is the assimilation of DMPs by other companies. Basically, there are only a
couple of independent DMPs left. Rocket Fuel bought x+1, Nielsen bought eXelate, and Oracle bought both BlueKai and
DataLogix. On the part of the acquirers, the trend reflects what we are saying here: Data is fundamental and
strategic. Thus, having some key data capabilities in house is critical. Remaining un-acquired are some players
that serve narrow markets (Lotame), and new entrants (Krux). Because barriers to entry for data companies are
low (except for relationships that garner footprint), there are and always will be dozens of smaller start-ups. Just
remember that if you use a DMP associated with another player (a DSP, for example), there will be several
forces that push you toward buying the media services that are closely integrated with the data. There are good
reasons that tight integration with data delivers better media: fewer lost bids, ability to manage frequency better,
proprietary data models driving your media, etc. However, when captive, a DMP may lose some scale; and scale
is the economic driver underlying data efficiency. In other words, a DMP may become less profitable after being
acquired, and therefore less able to innovate, etc.
One thing to remember is that as data becomes more strategic, and technology cheaper, profit will inevitably
flow toward the original source of data.
Ted McConnell works with most segments of the marketing communications industry. He serves on boards for
web technology startups, works as EVP Digital for the Advertising Research Foundation, is retained by high tech
media companies to drive product innovation, and helps large enterprises develop their digital marketing
strategies and processes. His main focus areas today are ways to eliminate waste in digital media,
programmatic TV buying via the online display ecosystem, and ways and means for enterprises to scale content
marketing. He spent 15 years presiding over digital marketing innovation at P&G, partnering with brands to put
new marketing models to work.
www.warc.com
All rights reserved including database rights. This electronic file is for the personal use of authorised users based at the subscribing
company's office location. It may not be reproduced, posted on intranets, extranets or the internet, e-mailed, archived or shared electronically
either within the purchaser's organisation or externally without express written permission from Warc.