
Amobee Presents

DataMine
Basic
Datasets, Use Cases & Syntax
Amobee DataMine Training

What is DataMine?

Amobee’s query-able data warehouse and analytics tool


• Stores transaction and user profile data
• DataMine Query Language (aka CQL) incorporates the simplicity of
SQL while allowing for more complicated data analysis without the
necessity of complex joins

Why use DataMine?


• Make your own reports!


The questions DataMine can help answer

What are performance, spend, and scale by all metrics or breakdowns?
• “Performance for the past 2 days has been poor. Should I take action
now or wait?”
• “This package is performing poorly and the reason isn’t clear in the
console. What’s my next step?”
• “Are there certain sites that I should be blocking or targeting depending
on performance?”
• “At what times of day and on what days of the week am I performing
best/worst?”
• “How does my performance vary across regions?”


The DataMine query tool is accessed in the Amobee console and has the look and feel of SQL... but it’s not really SQL

Quick Demo


Amobee’s Technology
Platform Overview

[Diagram, built up across several slides. Labels:]
DSP: Bid Requests, Impressions, Clicks, Actions
DMP: 1st, 2nd, and 3rd party data (Site Visitation, CRM, Publisher, Targeting, Demo, Media, Interest, In Market, etc.)
UID
Data (from the DSP and the DMP) -> DataMine -> Information

Robust Consumer Profile

148589375
Device | Third-Party Segment | Site Visit | Conversion | Email | CRM Segment
Mobile | Soccer Mom | Yahoo.com | Video | Opened | Loyalist
Time Stamp 2015-01-16T22:22:36+03:40 2014-12-13T22:23:58+80:09 2015-02-10T22:23:58+00:02 2015-01-16T22:22:36+03:03 2015-01-16T22:22:36+03:02

Data Storage
From a High Level


Campaign data is stored in either of two places, the ‘ICA’ tables and the ‘DMP’; audience data is exclusively in the DMP.

Campaign Data: The “ICA” Tables
Impression-Event Level Data
Impressions, Clicks, Actions
Tables: Impressions, Viewability, Actions, Engagement_events

Audience Data: The “DMP” Tables
User-Profile Level Data (Campaigns, 1st Party, 3rd Party, Site)
DMP table names preceded by “dmp.”…
Tables: Data, Provider_user_ids, Beacons, Segments


What does a user profile contain?

Sample user: 8 6 7 5 3 0 9

TIMESTAMP | DATA TYPE | DEVICE | PROVIDER | CATEGORY | ACTION
2018-02-16T22:22:36+40:06 | 1st Party Data | Offline | Ann Taylor - Site Visits | CRM Segment - Loyalist | Thank You Page Visit
2018-02-13T22:23:58+50:02 | 1st Party Data | Offline | Ann Taylor - Site Visits | CRM Segment - High Volume Purchaser | Shopping Cart Visit
2018-02-14T22:24:54+20:00 | Amobee Data | Desktop | Amobee | Yahoo.com | Site Visit
2018-01-16T22:22:36+50:07 | Amobee Data | Mobile | Amobee | Mobile Device | Received Impression
2017-12-13T22:23:58+80:09 | Partner Data | Desktop | LOFT - Maxymiser | LOFT In-Store LTV $1000+ | Shoes Page Visit
2017-12-14T22:24:54+07:02 | Partner Data | Mobile | LOFT - Hybris | LOFT In-Store Childrens $500+ | Abandon No Purchase
2018-01-14T22:22:36+00:60 | Partner Data | Desktop | LOFT - Maxymiser | LOFT In-Store LTV $1000+ | Purchase
2018-02-10T22:23:58+00:02 | 1st Party Data | Tablet | LOFT - Website | Ann Taylor In-Store Purchase Q4 2013 | Purchase - Mobile
2018-02-12T22:24:54+10:04 | 1st Party Data | Mobile | LOFT - Flextag | LOFT In-Store Childrens $500+ | Purchase
2018-01-01T22:22:36+00:05 | 3rd Party Data | Desktop | IXI Financial Data | Household Income $80k-$100K | Video
2018-02-13T22:23:58+20:04 | 3rd Party Data | Desktop | IXI Financial Data | LOFT In-Store LTV $1000+ | Purchase
2018-02-14T22:23:58+20:05 | 3rd Party Data | Desktop | Acxiom | Urban Fashionistas | Download - Catalog
2018-02-11T22:24:54+08:10 | 3rd Party Data | Tablet | DataLogix – Political Affiliation | Democratic | Email Open
2018-01-16T22:22:36+03:40 | Partner Data | Desktop | MediaMath | Digital Campaign | Display Ad
2017-12-13T22:23:58+06:06 | 3rd Party Data | Desktop | DataLogix – LifeStyle Affiliation | Gadget Geeks | Display Ad
2017-11-14T22:24:54+05:08 | 2nd Party Data | Desktop | Gymboree | Fashionista | Email Open
2018-01-16T22:22:36+03:02 | 2nd Party Data | Mobile | Gilt | Cocktail Dress Intender | Mobile App Ad
2018-01-16T22:22:36+03:03 | 2nd Party Data | Desktop | Nine West | Household 25-54, Shoe Shopper | LTV Customer


Schema Distinction


Schema distinction

In the Campaign Performance schema, the atomic level of detail is the impression.

In the User Profiles schema, the atomic level of detail is the user.

Schema distinction

In the Campaign Performance schema, we can only group and filter impressions according to features of those impressions.

• Which IO served the impression? insertion_order_name
• Which package served the impression? package_name
• Which creative was served for this impression? creative_name
• Which site was the impression served on? tld
• Is there a click associated with this impression? click -> sum(click)
• Is there an action associated with the impression? action -> sum(action)

SELECT
insertion_order_name
, package_name
, creative_name
, tld
, sum(click)
, sum(action)
FROM
impressions, clicks, actions
DATES
last_30_days
WHERE
insertion_order_id = 123456789


Schema distinction

In the Campaign Performance schema, we group and filter impressions according to the features of those impressions.

In the User Profiles schema, we group and filter users according to the features of those users.


Table structure – ICA


Simple relational database structure – entry per impression (processed way faster)

impression_datetime impression click action insertion_order_name package_name …


1397841195 1 0 1 insertion order a package a …
1397841197 1 0 0 insertion order a package b …
1397841198 1 0 0 insertion order a package c …
1397841198 1 1 0 insertion order a package a …
1397841199 1 1 0 insertion order a package d …
1397841200 1 0 1 insertion order b package a …
1397841201 1 0 0 insertion order b package a …
1397841202 1 0 0 insertion order b package a …
1397841203 1 1 1 insertion order b package a …
1397841205 1 0 0 insertion order b package a …
1397841205 1 0 1 insertion order b package a …
1397841206 1 0 0 insertion order b package a …
1397841207 1 1 0 insertion order c package f …
… … … … … … …
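
As a rough illustration of what a query does with this structure, the Python sketch below groups the first six rows of the table by insertion order and package and sums the measures. The function name `aggregate` is hypothetical, and this only mimics the idea of grouping by dimensions; it is not how DataMine is implemented.

```python
from collections import defaultdict

# First six rows of the table above:
# (impression_datetime, impression, click, action, insertion_order_name, package_name)
rows = [
    (1397841195, 1, 0, 1, "insertion order a", "package a"),
    (1397841197, 1, 0, 0, "insertion order a", "package b"),
    (1397841198, 1, 0, 0, "insertion order a", "package c"),
    (1397841198, 1, 1, 0, "insertion order a", "package a"),
    (1397841199, 1, 1, 0, "insertion order a", "package d"),
    (1397841200, 1, 0, 1, "insertion order b", "package a"),
]

def aggregate(rows):
    """Sum impressions, clicks, and actions per (insertion order, package)."""
    totals = defaultdict(lambda: [0, 0, 0])
    for _, impression, click, action, io, package in rows:
        group = totals[(io, package)]
        group[0] += impression
        group[1] += click
        group[2] += action
    return {key: tuple(values) for key, values in totals.items()}

result = aggregate(rows)
for (io, package), (imps, clicks, actions) in sorted(result.items()):
    print(io, package, imps, clicks, actions)
```

Each distinct combination of dimension values becomes one output row, which is why the query language can leave the grouping implicit.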

ICA Tables


Impressions, clicks, actions tables

Campaign, user, and ad-call related data for impressions. Action = campaign goal event.

Key Dimensions
• User ID
• Device Type/OS Type
• Browser Name
• Insertion Order/Package/Line Item
• TLD/Subdomain/IAB Category/App Name
• Zip Code/State
• Brand Safety Category/Contextual Category
• Creative Name/Creative Size

Key Measures
• Impression
• Clicks
• Actions
• Cost
• Timestamp
• Estimate Count Uniques

Engagement Events
Video player engagement events

Key Dimensions
• Engagement Event ID (View, 25% complete, 50% complete, completed view, pause, mute, etc.)

Key Measures
• Count

Viewability
Video viewability metric – MOAT, DV, or IAS (metrics vary by provider).
Display viewability is currently in beta (2%).

Key Dimensions Key Measures

Measurable Impression

Analyzed Impression

In View Impression

AVOC Impressions

Seconds 100% In View

Estimate Count Uniques


Basic CQL Syntax


DataMine query structure

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost
FROM
impressions,clicks,actions
DATES
[2013_12_01,2013_12_11]
WHERE
insertion_order_id IN (978565872,986574352)
HAVING
SUM(impression)>1000

• No need for “join on” or “group by”
• No “as” to give column alias
• At least 1 measure required

Let’s break it apart


The SELECT clause

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost

Dimensions (insertion_order_name, package_name): How would you like your metrics aggregated/grouped?
Measures (the SUM(...) fields): What metrics do you want to see?

“Please send a report that breaks out impressions and cost by insertion orders and packages”

“I would like to see my spend and action volumes by date and TLD”

Common dimensions and measures

Dimensions:
insertion_order_name
package_name
line_item_name
creative_name
tld
dma
region
to_date_2(impression_datetime)
to_day_of_week(impression_datetime)
to_hour(impression_datetime)

Measures:
sum(impression)
sum(click)
sum(action)
sum(cost)

Aggregate data at various levels (be careful)

By IO and Package:
SELECT
insertion_order_name io,
package_name package,
sum(impression) imps,
sum(click) clicks,
sum(action) actions,
sum(cost) cost

By Package and Line Item:
SELECT
package_name package,
line_item_name line_item,
sum(impression) imps,
sum(click) clicks,
sum(action) actions,
sum(cost) cost

By Package, LI and Exchange:
SELECT
package_name package,
line_item_name line_item,
inventory_source_name exchange,
sum(impression) imps,
sum(click) clicks,
sum(action) actions,
sum(cost) cost

By Package, LI, Exchange and Date:
SELECT
package_name package,
line_item_name line_item,
inventory_source_name exchange,
to_date_2(impression_datetime) date,
sum(impression) imps,
sum(click) clicks,
sum(action) actions,
sum(cost) cost

The SELECT Clause: Alias

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost

Aliases: io, package, imps, clicks, actions, cost

Give a field an alias and it will become the name of its column in your output file. This makes your output file a lot cleaner, but be sure to give a name that makes sense!

Note that your alias name must begin with a letter.

The SELECT Clause: Alias

Without Aliases:
insertion_order_name package_name sum(impression) sum(click) sum(action) sum(cost)
Prospecting Package A 10000 27 1 30
Prospecting Package B 12500 37 3 27
Retargeting Package A 6000 25 5 35

With Aliases:
IO Package Imps Clicks Actions Cost
Prospecting Package A 10000 27 1 30
Prospecting Package B 12500 37 3 27
Retargeting Package A 6000 25 5 35


The SELECT Clause: Time aggregation

Time functions return dates in “market time zone.” For Kraft, EST.

By Month:
to_month(impression_datetime) outputs YYYY_MM
By Date:
to_date(impression_datetime) outputs YYYY_MM_DD
or
to_date_2(impression_datetime) outputs MM/DD/YYYY
By Day of Week:
to_day_of_week(impression_datetime) outputs Sunday, Monday, Tuesday, etc.
By Hour of Day:
to_hour(impression_datetime) outputs an integer between 0 and 23
Event Level:
to_time(impression_datetime) outputs date and time of event down to the second
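
To make the output formats concrete, here is a small Python sketch that mimics the functions listed above. The Python functions are stand-ins written for illustration, not DataMine code, and the sample timestamp is hypothetical and assumed to be already in the market time zone. `to_time` is left out because the slide does not give its exact format.

```python
from datetime import datetime

# Day names indexed by datetime.weekday(), where Monday is 0.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def to_month(dt):
    return dt.strftime("%Y_%m")        # YYYY_MM

def to_date(dt):
    return dt.strftime("%Y_%m_%d")     # YYYY_MM_DD

def to_date_2(dt):
    return dt.strftime("%m/%d/%Y")     # MM/DD/YYYY

def to_day_of_week(dt):
    return DAY_NAMES[dt.weekday()]     # Sunday, Monday, ...

def to_hour(dt):
    return dt.hour                     # integer between 0 and 23

# Hypothetical impression time, already in the market time zone.
ts = datetime(2013, 12, 1, 14, 30, 5)
for fn in (to_month, to_date, to_date_2, to_day_of_week, to_hour):
    print(fn.__name__, fn(ts))
```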


The SELECT Clause: Calculation fields

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost
, SUM(cost)/SUM(action) cpa
, (SUM(cost)/SUM(impression))*1000 cpm

Dimensions: io, package
Measures: imps, clicks, actions, cost
Calculations: cpa, cpm

IO           Package    Imps   Clicks  Actions  Cost   CPA    CPM
Prospecting  Package A  10000  27      1        30.00  30.00  3.00
Prospecting  Package B  12500  37      3        27.00  9.00   2.16
Retargeting  Package A  6000   25      5        35.00  7.00   5.83
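
The CPA and CPM columns can be checked by hand. The Python sketch below repeats the two calculations on the three result rows shown above; the helper names `cpa` and `cpm` are hypothetical, and how DataMine itself handles a group with zero actions or zero impressions is not covered by the slide, so the sketch does not handle that case.

```python
def cpa(cost, actions):
    """Cost per action: SUM(cost) / SUM(action)."""
    return cost / actions

def cpm(cost, impressions):
    """Cost per thousand impressions: (SUM(cost) / SUM(impression)) * 1000."""
    return cost / impressions * 1000

# (io, package, imps, clicks, actions, cost) from the result table above
rows = [
    ("Prospecting", "Package A", 10000, 27, 1, 30.00),
    ("Prospecting", "Package B", 12500, 37, 3, 27.00),
    ("Retargeting", "Package A", 6000, 25, 5, 35.00),
]

for io, package, imps, clicks, actions, cost in rows:
    print(io, package, round(cpa(cost, actions), 2), round(cpm(cost, imps), 2))
```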


The FROM Clause

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost
FROM
impressions,clicks,actions

• Specifies the table(s) where your data lives.
• Only the tables that contain your fields are necessary.

The DATES Clause

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost
FROM
impressions,clicks,actions
DATES
[2013_12_01,2013_12_11]

• Specifies the date range for your query
• Can be absolute: [yyyy_mm_dd, yyyy_mm_dd]
• Or relative: last_30_days, last_7_days, etc.

The WHERE Clause

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost
FROM
impressions,clicks,actions
DATES
[2013_12_01,2013_12_11]
WHERE
insertion_order_id IN (978565872,986574352)

• Use WHERE to filter data by one or more dimensions
• It is technically optional (but highly encouraged)
• Accepts numeric and string values using = or IN

The WHERE Clause

There is no difference between these three syntaxes:

WHERE
insertion_order_id IN (123456789, 987654321)

WHERE
(insertion_order_id = 123456789) OR (insertion_order_id = 987654321)

WHERE
insertion_order_id IN (123456789) OR insertion_order_id IN (987654321)

The WHERE Clause

=     package_id=97865476
IN    package_id IN (97865476,97863890)
AND   package_id=97865476 AND tld IN ('nytimes.com','whatdobadgerseat.com')
LIKE  tld LIKE '%book%'
OR    tld LIKE '%times%' OR tld LIKE '%badger%'
NOT   package_id=97865476 AND NOT tld IN ('nytimes.com','whatdobadgerseat.com')

The HAVING Clause

SELECT
insertion_order_name io
, package_name package
, SUM(impression) imps
, SUM(click) clicks
, SUM(action) actions
, SUM(cost) cost
FROM
impressions,clicks,actions
DATES
[2013_12_01,2013_12_11]
WHERE
insertion_order_id IN (978565872,986574352)
HAVING
SUM(impression)>1000

• Use HAVING to filter by an aggregated measure
• Accepts same operators as the WHERE clause
• Optional as well
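
The difference between WHERE and HAVING is easiest to see as two filtering steps, one on either side of the aggregation. The Python sketch below is an illustration under that assumption, with hypothetical rows and a hypothetical `run_query` helper: WHERE drops individual rows by a dimension value, and HAVING drops whole groups by their aggregated measure.

```python
from collections import defaultdict

# Hypothetical impression rows: (insertion_order_id, package_name, impression)
rows = [
    (978565872, "package a", 1),
    (978565872, "package a", 1),
    (978565872, "package a", 1),
    (978565872, "package b", 1),
    (986574352, "package c", 1),
    (986574352, "package c", 1),
    (111111111, "package z", 1),
    (111111111, "package z", 1),
    (111111111, "package z", 1),
]

def run_query(rows, io_ids, min_impressions):
    # WHERE: keep only rows whose dimension value matches the filter.
    kept = [row for row in rows if row[0] in io_ids]
    # Aggregate: SUM(impression) per package.
    totals = defaultdict(int)
    for _, package, impression in kept:
        totals[package] += impression
    # HAVING: keep only groups whose aggregated measure passes the threshold.
    return {p: n for p, n in totals.items() if n > min_impressions}

result = run_query(rows, {978565872, 986574352}, min_impressions=1)
print(result)
```

Here "package z" is removed by the WHERE step even though it has three impressions, while "package b" survives WHERE but is removed by HAVING.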

ICA query example

SELECT
creative_size,
sum(cost)/sum(impression)*1000 CPM
FROM
impressions
DATES
[2016_09_01, 2016_10_31]
WHERE
insertion_order_id = 123456789

Creative Size  CPM
970x250        $6.59
300x250        $3.17
300x50         $0.96
160x600        $6.22
300x600        $3.81
320x50         $3.14
728x90         $6.15

ICA query example

SELECT
media_channel_id,
device_type,
sum(impression) impressions,
sum(cost) cost
FROM
impressions, clicks, actions
DATES
[2016_01_01, 2016_02_28]
WHERE
insertion_order_id = 123456789

media_channel_id  device_type   impressions  cost
1                 PC            61,557,640   $433,389
4                 Mobile Phone  16,882,070   $115,113
1                 Unknown       7,583,113    $56,082
2                 PC            7,434,689    $128,409
4                 Unknown       7,143,144    $48,621
1                 Mobile Phone  6,912,131    $52,059
2                 Unknown       4,271,376    $57,376
2                 Mobile Phone  3,129,382    $40,558
1                 Tablet        3,071,813    $22,198

CASE WHEN… THEN… ELSE… END statement

A CASE statement in DataMine is how we do IF/THEN/ELSE evaluation in-query.

It has a number of uses, but one common use is for grouping members of a dimension according to some
attribute.

, CASE WHEN package_name LIKE '%rt%' THEN 'Retargeting'
WHEN package_name LIKE '%pt%' THEN 'Prospecting'
WHEN package_name LIKE '%bt%' THEN 'Behavioral'
ELSE 'Other' END my_tactic

Note: Order matters (much like an if-else statement)


CASE WHEN… THEN… ELSE… END statement

SELECT
package_name package
, CASE WHEN package_name LIKE '%rt%' THEN 'Retargeting'
WHEN package_name LIKE '%pt%' THEN 'Prospecting'
WHEN package_name LIKE '%bt%' THEN 'Behavioral'
ELSE 'Other' END my_tactic
, sum(impression) tot_impressions
FROM impressions
DATES last_7_days
WHERE insertion_order_id=75764983092

package              tot_impressions
Homepage RT          6,299,824
Conversion RT        5,890,362
General PT           6,400,930
Weekend Heavy Up PT  6,426,370
New York PT          1,336,791
In Market BT         581,108
18-25 BT             4,480,479

my_tactic    tot_impressions
Behavioral   5,061,587
Prospecting  14,164,091
Retargeting  12,190,186
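
The grouping in this example can be reproduced outside DataMine. The Python sketch below applies the same three tests in the same order to the package totals from the slide and sums impressions per tactic. It assumes LIKE matches regardless of case, which the slide's results suggest (a pattern of '%rt%' matching "Homepage RT") but do not state; the helper names are hypothetical.

```python
def like_contains(value, fragment):
    """Mimic LIKE '%fragment%'. Case-insensitive matching is an assumption."""
    return fragment.lower() in value.lower()

def my_tactic(package_name):
    # Branches are tested top to bottom; the first match wins.
    if like_contains(package_name, "rt"):
        return "Retargeting"
    if like_contains(package_name, "pt"):
        return "Prospecting"
    if like_contains(package_name, "bt"):
        return "Behavioral"
    return "Other"

# Package totals from the slide
packages = {
    "Homepage RT": 6299824,
    "Conversion RT": 5890362,
    "General PT": 6400930,
    "Weekend Heavy Up PT": 6426370,
    "New York PT": 1336791,
    "In Market BT": 581108,
    "18-25 BT": 4480479,
}

totals = {}
for name, impressions in packages.items():
    tactic = my_tactic(name)
    totals[tactic] = totals.get(tactic, 0) + impressions

print(totals)
```

Order matters because a name can satisfy more than one pattern. A hypothetical package called "Smart PT" would land in Retargeting, since "Smart" contains "rt" and that branch is tested first.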

Another ICA query example using CASE WHEN

SELECT
media_channel_id,
CASE WHEN viewability_video_duration>0
AND acc_100pct_vis_aud_sec >= (viewability_video_duration / 2.0) THEN 1 ELSE 0 END GroupM,
CASE WHEN viewability_video_duration>0
AND acc_100pct_vis_sec >= (viewability_video_duration / 2.0) THEN 1 ELSE 0 END GroupM_no_sound,
SUM(CASE WHEN viewability_video_duration>0
AND acc_100pct_vis_aud_sec >= (viewability_video_duration / 2.0) THEN 1 ELSE 0 END) GroupM_imp,
SUM(CASE WHEN viewability_video_duration>0
AND acc_100pct_vis_sec >= (viewability_video_duration / 2.0) THEN 1 ELSE 0 END) GroupM_no_sound_imp,
SUM(avoc_imp),
SUM(measurable_imp),
SUM(viewable_imp),
SUM(groupm_viewable_imp)
FROM viewability
DATES last_90_days
WHERE country_id IN (1,2)
AND media_channel_id = 2

media_channel_id  groupm  groupm_no_sound  groupm_imp  groupm_no_sound_imp  sum(avoc_imp)  sum(measurable_imp)  sum(viewable_imp)  sum(groupm_viewable_imp)
2                 1       1                9215765     9215765              8129914        9215765              9161765            9169866
2                 0       0                0           0                    1384570        7588983              3650427            144909
2                 0       1                0           741049               25438          741049               735030             0
2                 1       0                1277247     0                    1135635        1277247              1271104            1268938

DataMine Helpful Tips


Writing efficient ICA queries

• The smaller the DATES range, the more efficient your query is.
• String matching is expensive. Filter on IDs where possible


COUNT vs. ESTIMATE DISTINCT

COUNT(DISTINCT ….) functionality is not allowed on ICA tables

• DMP tables
• COUNT(DISTINCT user_id)

• ICA tables
• ESTIMATE(DISTINCT user_id)

ESTIMATE should get you within ~3% of the true number.


Castlong

Castlong can be thought of as a “round down” function and it can be used to create numeric buckets as
dimensions.
• Multiplying by 1,000 provides the CPM at $1 increments, for example:
• castlong(cost*1000)

• Multiplying by 2,000 and dividing by 2 provides price buckets of .50 increments, for example:
• castlong(cost*2000)/2

• Multiplying by 10,000 and dividing by 10 provides price buckets of .10 increments, for example:
• castlong(cost*10000)/10
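
A quick way to see the bucketing is to run the three expressions on one value. In the Python sketch below, `castlong` is a stand-in that rounds down to a whole number, as the slide describes, and the per-impression cost is a hypothetical figure. Treating the division after castlong as ordinary (non-integer) division is an assumption, though the .50 and .10 increments on the slide imply it.

```python
import math

def castlong(x):
    """Stand-in for castlong: round down to a whole number."""
    return math.floor(x)

cost = 0.00367  # hypothetical cost of one impression, in dollars (a $3.67 CPM)

bucket_1_00 = castlong(cost * 1000)        # CPM in $1 buckets
bucket_0_50 = castlong(cost * 2000) / 2    # CPM in $0.50 buckets
bucket_0_10 = castlong(cost * 10000) / 10  # CPM in $0.10 buckets

print(bucket_1_00, bucket_0_50, bucket_0_10)
```

The same $3.67 CPM falls into the 3, 3.5, and 3.6 buckets respectively, so the multiplier sets how fine the price bands are.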


Variables
To simplify query syntax and avoid mistakes due to duplication, use SET to define a variable and ${ } to call it. Don’t forget the semicolon ‘;’ at the end of each SET statement.

SET group_a = 1601111111,1602222222,1603333333,1604444444;
SET group_b = 1605555555,1606666666,1607777777,1608888888;

SELECT
CASE
WHEN line_item_id in (${group_a}) THEN 'Group A'
WHEN line_item_id in (${group_b}) THEN 'Group B'
ELSE 'Unknown'
END
, sum(impression)
FROM impressions
DATES [2018_01_01, 2018_01_31]
WHERE line_item_id IN (${group_a},${group_b})
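
One way to picture what ${ } does is plain text substitution: the variable's value is pasted into the query wherever it is referenced. The slide does not spell out the mechanism, so treat that as an assumption. Python's `string.Template` happens to use the same `${name}` placeholder form, which makes for a compact sketch:

```python
from string import Template

# Values mirror the two SET statements above.
variables = {
    "group_a": "1601111111,1602222222,1603333333,1604444444",
    "group_b": "1605555555,1606666666,1607777777,1608888888",
}

where_clause = Template("WHERE line_item_id IN (${group_a},${group_b})")
expanded = where_clause.substitute(variables)
print(expanded)
```

Because each list is defined once and referenced by name, the CASE branches and the WHERE filter cannot drift out of sync.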


Thank you!
