
Testing Emails for Dramatic Improvements
Michael Monaghan
If you came for stuff about APIs, a few of
my best examples were shut
down/deprecated. I’ll be giving another
talk or writing a how-to about it!
1. Overview
2. How To Run A Test
3. Stats 101
4. Pitfalls
5. Rules & Best Practices
6. TOOLS
7. Q&A
Overview
Who I Am
Lived abroad for a while, came back to go to Uni... still need to do that.

Sourcing basics at Sourcing Institute

More advanced learning with Dave Mendoza.

Sourcer/Researcher for Iris Libby Recruitment Consultants

First Technical Sourcer at Etsy.

Head of Talent then Head of Sourcing for InterviewJet.

NOW: Tech. Recruiter at Blue Apron.
Quick Questions for the Audience
Assumptions I’m Making
You care about your response rate.

You send emails as part of your day-to-day business. It’s a unit of work.

You’re adventurous enough to explore some tools (it’s tedious and unreliable to
gather data manually).
What is AB Testing?
A/B testing, aka split testing or bucket testing, compares two versions of
something to see which performs better.
Where Can I Use it?
Emails

Phone Scripts

Internal Communications (Referral Programs anyone?)

Job Descriptions

Career Sites/Page layouts

Landing Pages

Event Invites

Interview Processes
Why should I be A/B testing?
1. It helps you realize the actual impact of proposed changes

2. It lowers risk and helps you AND OTHERS trust your data - less time arguing
about execution and more time actually doing

3. Takes the guesswork out of optimization - you don't have to be a mind-reader
and guess the best way to get through to someone

4. Probably the cheapest way to improve conversions - every email (or phone script, or
job description) is valuable, so make the most of it
Key Terms
- Conversion Metric - The metric that defines what you want or are asking
people to do.
- E.g. for emails, a possible conversion would be a reply. It could also be to click on a link.

- Bucket/Sample/Population group - This is the group of people you're using in
your test.
What is AB Testing NOT?
Not a big win every time

Isn’t for everyone or every company

Not validation of tricks, guesswork or psychology gimmicks

Not HiPPO (deferring to the Highest Paid Person's Opinion)

Not a solution to a poor choice of sample/population, i.e. if your
sourcing/targeting/research sucks, it might skew your results.
How To Run A Test
4 Steps to AB Testing

1. Analyze & Ideate
2. Form a Hypothesis
3. Construct an Experiment/Test
4. Interpret Results
Analyze & Ideate

Quantitative data tells you where to test
Analyze & Ideate - Quantitative Data
Look at past campaigns or emails!

1. Do you know who opened the email and responded?

2. Do you know who opened and didn’t respond?

3. Do you know who didn’t even open your email (as far as you know)?

What else? Jobs page traffic, CRM/ATS Interactions
Analyze & Ideate - Qualitative Data

Qualitative data gives you an idea of what should
be tested.
Analyze & Ideate - Qualitative Data
Candidate Feedback
“Your messaging was too long!” (Content)
“I’m paranoid about attachments/shortlinks so I don’t click
on them” (Content)
“I actually never received your email, it went to spam”
(Practices)
Analyze & Ideate - Qualitative Data

Survey Data
“Not the right time to contact me, I just started
a new role”
(Poor Sourcing/Research)
Analyze & Ideate - Qualitative Data

Heat Mapping
Are people reading the PDFs I send? How
much are they reading?
Form A Hypothesis
IF [variable] then [result], because [rationale]

Variable = element being modified

Result (Predicted outcome) = Use data to determine the size of the effect (10% more
responses = X more phone screens based on prior numbers; see the worked sketch below)

Rationale = Demonstrate your candidate/target knowledge (What assumption
will be proven wrong if the experiment is a draw or loses?)
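
A quick worked version of that arithmetic, as a minimal Python sketch. All of
the rates and counts here are hypothetical; swap in your own funnel data:

# All numbers here are hypothetical; swap in your own funnel data.
emails_sent = 500
baseline_response_rate = 0.12   # 12% of emails currently get a reply
expected_lift = 0.10            # hypothesis: +10% relative improvement
reply_to_screen_rate = 0.50     # half of replies turn into phone screens

baseline_replies = emails_sent * baseline_response_rate        # 60
predicted_replies = baseline_replies * (1 + expected_lift)     # 66
extra_screens = (predicted_replies - baseline_replies) * reply_to_screen_rate

print(f"Extra phone screens from a 10% lift: {extra_screens:.0f}")  # 3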
Form A Hypothesis
Weak Hypothesis = If we personalize, response rate will increase

Strong Hypothesis = If we personalize the first sentence of the email past the
salutation, response rate will increase by 10% because candidates will see that
we’re doing basic research on them and thus, our outreach is likely to seem more
relevant.
Form A Hypothesis
If you don’t really have any idea how big an effect your changes will have, but
you’ve figured out where you might need some work:

Look to goals and work backwards

Go for Confidence

Run a bunch of tests
Construct an Experiment

What are you saying?
How does it look?
How does it work?
Construct an Experiment

How big is your sample size?
What group of people are you targeting?
How will you track your results?
When does this experiment start and end?
Construct an Experiment at Scale
Drip Email Marketing
Most large email platforms support it (e.g. Mailchimp)
CRMs are mighty cheap and lightweight nowadays
Plug the two into each other, and maybe your ATS too
Interpret Results

How confident am I that the observed difference
from my experiment was not due to chance?
90% statistical significance = 10% (aka 1 in 10) probability the observed
difference was due to chance.

There are already enough variables we can't control, so let's limit our guesswork and
shotgun approach as much as possible.
- Put constraints on your problem so it's easier to solve.
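
One common way to answer that confidence question for email-sized experiments
is a two-proportion z-test. A minimal sketch in Python; the reply counts are
hypothetical, and statsmodels is just one library choice:

from statsmodels.stats.proportion import proportions_ztest

# Hypothetical results: variant A got 40 replies out of 400 emails sent,
# variant B got 64 replies out of 400.
replies = [40, 64]
sent = [400, 400]

z_stat, p_value = proportions_ztest(count=replies, nobs=sent)
print(f"p-value: {p_value:.3f}")  # roughly 0.01 with these inputs
# p < 0.05 means at least 95% significance: under a 1-in-20 chance
# the observed difference is just noise.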
Stats 101
Statistical Significance and Why It’s Important
We want to trust our numbers.

We want to make sure we're moving in the right direction.

Statistical significance gives us a way to measure confidence in a result.

95% statistical significance = a 1 in 20 chance the observed difference was due to
chance. How much risk can you take?

Already enough variables in this world, let’s grab hold of what we can.
Statistical Significance and Why It’s Important

Let’s say you toss a coin 100 times:
Heads: 46
Tails: 54
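
That 46/54 split feels lopsided, but an exact binomial test shows it is well
within the range of chance. A quick check in Python (scipy 1.7+ is just one
way to run it):

from scipy.stats import binomtest

# 46 heads and 54 tails out of 100 tosses: is the coin unfair,
# or is this just chance?
result = binomtest(k=46, n=100, p=0.5)
print(f"p-value: {result.pvalue:.2f}")  # roughly 0.48: no evidence of bias
# You'd need a far more lopsided split (or many more tosses) before
# you could call this coin unfair.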
Sample Size
Use a calculator. (http://www.evanmiller.org/ab-testing/sample-size.html)
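
If you'd rather see the math, here is a minimal Python version of the standard
two-proportion sample-size formula, comparable to what calculators like the one
above compute. The 10% and 15% example rates are hypothetical:

from scipy.stats import norm

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    """Approximate emails needed per variant to detect a change
    in response rate from p1 to p2 (standard two-proportion formula)."""
    z_alpha = norm.ppf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = norm.ppf(power)            # desired statistical power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return numerator / (p1 - p2) ** 2

# e.g. to detect a lift from a 10% to a 15% response rate:
print(round(sample_size_per_variant(0.10, 0.15)))  # about 686 per variant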
Sample Size - What does that tell us?
If we don’t have a lot of people to hit up, go for high-impact changes.

Test entirely different emails.

Test many at the same time.

Cut it at a week if no major uptick.

Don’t sit around and suffer with a low-converting template.
Statistical Significance
Why are we talking about this?
Because it turns out math proves how dumb we are. We peek at results and stop
experiments too soon, and it's actually a big problem.

[I’ve never done more than a bit of reading about the following]

If you want flexibility and are good at math (or know someone who is):

Sequential Experiment Design - Lets you set up points in advance where you
will decide whether or not to continue, and it gives you the correct significance
levels.

Bayesian Experiment Design - You can stop your experiment at any time and
make perfectly valid inferences.
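
For the curious, here is a minimal sketch of the Bayesian flavor, assuming
hypothetical reply counts and a uniform prior. It estimates the probability
that variant B's true response rate beats A's:

import numpy as np

# Hypothetical results: A got 40 replies from 400 emails, B got 55 from 400.
rng = np.random.default_rng(0)

# Beta(1 + replies, 1 + non-replies) is the posterior for each
# response rate under a uniform prior.
a_samples = rng.beta(1 + 40, 1 + 360, size=100_000)
b_samples = rng.beta(1 + 55, 1 + 345, size=100_000)

prob_b_beats_a = (b_samples > a_samples).mean()
print(f"P(B's true response rate > A's): {prob_b_beats_a:.0%}")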
Rules & Best Practices
Document and Share your findings!
Track your results! At the most basic, keep a spreadsheet detailing experiments, what
was changed and how it was changed, as well as metrics (see the sketch after this list).

Increases Company Transparency

Helps others know where tests are going on

Makes it easier to call tests when they’re winning/losing

Makes learnings available to everyone in the company
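
At its simplest, that spreadsheet can be a plain CSV with one row per
experiment. A minimal Python sketch; the column names and the sample row are
made up, so adapt them to your process:

import csv

# Hypothetical columns: one row per experiment, filled in as it runs.
fields = ["experiment", "hypothesis", "variant_a", "variant_b",
          "start", "end", "sent_a", "sent_b",
          "replies_a", "replies_b", "result", "notes"]

with open("ab_test_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(fields)
    writer.writerow(["subject-length-01",
                     "Shorter subject lines lift replies 10%",
                     "long subject", "short subject",
                     "2017-05-01", "2017-05-08",
                     400, 400, 40, 52, "B wins",
                     "called at 95% significance, rolling B out"])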
Rules & Best Practices
Don’t stop early.

Test frequently.

Follow results down the funnel.

Statistical significance of at least 95%! 99% is where you SHOULD be, if you have the
volume.

Keep backlog of test hypotheses.

Build next test as current one is running. STACK YOUR GAINS!
Ultimately, remember that your goal isn’t
necessarily a higher conversion rate, but rather
whatever that higher conversion rate will enable
you to do.
Pitfalls
If you create the rules which govern how a test will be run before it starts
getting results, you eliminate the most common ways of introducing bias. The
more decisions you make after results start coming in, the more traps you are
setting for yourself.
Pitfalls
Stopping Tests Early

http://www.yesware.com/blog/best-time-to-send-email/
Pitfalls

Regression to the Mean
Pitfalls

A/B tests overestimate uplift (Winner’s Curse)
Pitfalls

Not Running Tests Long Enough
Pitfalls
Do you send only a few emails a day? A/B testing might not be a great option for
you. Instead, go for big, radical changes, and quickly. The idea is that you’re going
for massive lifts, like 50%.

Time is money, so don’t waste time on a test that could take months.
Pitfalls

Selection effect - This occurs when we wrongly
assume some portion of the audience represents
the totality of the audience.
TOOLS

CRMs - Salesforce, Zoho, etc.
Email Platforms - Mailchimp, etc.
Drip/Email Automation - Outreach.io, IfNoReply, MixMax
Q&A