WHY A/B TESTS?
We make a change. Effect?
What is the problem with this?
[Figure: timeline (⏰ time). We make a change, then compare before and after.]
[Figure: timeline (⏰ time) around "We make a change", with other things that vary over the same period:]
📆 Weekday
🛠 Product changes
📺 Content changes in the product
🎉 Holidays
📰 Press
🌦 Weather
👻 Something is trending on social media
📢 Marketing
💥 Campaign
What change??
Statistical significance in the effect
With an A/B test, the change runs simultaneously with the control group, which minimizes the number of variables that can affect the outcome.
Group A: gets no change
Group B: gets a change
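Running both groups at the same time requires some way to split users between them. The sketch below shows one common approach, hash-based bucketing, which keeps each user in the same group on every visit. It is an illustration only: the function name `assign_group`, the user ids and the experiment name are hypothetical and do not come from the slides.

```python
import hashlib

def assign_group(user_id: str, experiment: str, groups=("A", "B")) -> str:
    """Deterministically map a user to one group of an experiment."""
    # Hash the experiment name together with the user id, so the same user
    # always lands in the same group, while different experiments split
    # users independently of each other.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % len(groups)
    return groups[bucket]

# Hypothetical usage: split 10,000 made-up user ids and count the groups.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_group(f"user-{i}", "new-checkout-button")] += 1
print(counts)  # the two counts should be roughly equal
```

Because the assignment depends only on the ids, no per-user state has to be stored to keep Group A and Group B stable for the duration of the test.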
Our change
Statistical significance in the effect
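One common way to check whether the difference between Group A and Group B is statistically significant is a two-proportion z-test on the conversion rates. The sketch below assumes the KPI is a simple conversion (converted or not); the function name and the visitor and conversion counts are made up for illustration, and other tests (for example Bayesian ones) may suit other setups better.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided z-test for a difference in conversion rate between A and B."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis "no difference".
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical numbers: 10,000 visitors per group, 200 vs 260 conversions.
z, p_value = two_proportion_z_test(conv_a=200, n_a=10_000, conv_b=260, n_b=10_000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A p-value below the chosen significance level (often 0.05) is usually read as a statistically significant effect of the change.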
WHEN A/B TESTS?
Different purposes
1. PURE OPTIMIZATION
Which version gives the best effect?
Example: Product image, title optimization
2. RELEASES
To make sure a “package of changes” doesn’t have a negative/unexpected effect.
3. VALIDATE A HYPOTHESIS
A/B tests are perfect for this!
Primary reason to work with A/B tests.
If we make [change], then we’ll achieve [effect].
VALIDATED LEARNING
Learning by validating the connection between change and effect.
TIPS & PITFALLS
Make sure there is enough traffic for your A/B test.
FAIL: Not achieving statistical significance within a reasonable time.
The amount of traffic needed depends on:
• Number of variants
• Baseline
• Minimum Detectable Effect (MDE)
Use:
https://conversionxl.com/ab-test-calculator/
https://www.optimizely.com/sample-size-calculator/
https://abtestguide.com/bayesian/
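To make the three inputs concrete, here is a rough sketch of a textbook sample-size estimate for comparing two conversion rates. It is not the method used by the calculators linked above, whose results may differ. The assumptions are labeled in the code: a relative MDE, a two-sided test at 5% significance with 80% power, and a Bonferroni correction when there is more than one variant besides the control. The function and parameter names are invented for this example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline: float, mde_relative: float,
                            n_variants: int = 2,
                            alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variant (normal approximation)."""
    p1 = baseline                       # baseline conversion rate, e.g. 0.05
    p2 = baseline * (1 + mde_relative)  # rate we want to be able to detect
    # Assumption: each non-control variant is compared with the control,
    # and alpha is split across those comparisons (Bonferroni correction).
    comparisons = max(1, n_variants - 1)
    z_alpha = NormalDist().inv_cdf(1 - (alpha / comparisons) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# Hypothetical example: 5% baseline, 10% relative MDE, control + 1 variant.
n = sample_size_per_variant(baseline=0.05, mde_relative=0.10)
print(n)  # roughly 31,000 visitors per variant

# More variants, or a smaller MDE, push the required traffic up.
n_three = sample_size_per_variant(baseline=0.05, mde_relative=0.10, n_variants=3)
n_small_mde = sample_size_per_variant(baseline=0.05, mde_relative=0.05)
```

Halving the MDE roughly quadruples the traffic needed, which is why small changes are so expensive to validate.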
Make sure there are clear goals and KPIs that you can affect.
FAIL: Not knowing how to evaluate the A/B test: lacking a clear, relevant and measurable goal.
To know what effect we achieve, we need to measure the effect with relevant KPIs.
Don’t forget to align with overall goals.
FAIL: We achieve positive results from our A/B tests, but don’t realise we’re suboptimizing the product, or cannibalising other parts.
Make sure you have measurable overall goals.
Only test one change per variant in order to be able to understand what change actually affects the result.
FAIL: Not knowing what actually affected the results because of too many changes. Hard to learn anything.
You can afford to learn when you have enough traffic.
If you’re nowhere near your goal, you just need to innovate and change everything.
Test wide. Don’t chicken out!
FAIL: Slow learning tempo and small learning leaps.
To maximize and speed up learning, try many different variants and different types of solutions. Big differences instead of small details.
Small changes = small effects = small learnings
Big changes = big effects = big learnings
Build as little as possible.
FAIL: It takes a long time to build/create A/B tests. Too complex solutions that slow down the learning tempo.
The change should be good enough to show real users, but don’t forget it’s just an experiment.
Only when we’ve validated a solution do we build it “for real”; otherwise we risk unnecessary waste.
FAIL: Incorrect results (and decisions) because of short or incomplete experiment periods.
[Figure: WEEKLY VARIATIONS. Chart over several weeks, with each Sunday marked on the time axis.]
1. Always test whole cycles, and at least two cycles.
In this case: Test for whole weeks and at least two full weeks, due to weekly variations.
[Figure: THE CHANGE CURVE. CONVERSION RATE over time after a change, annotated with 😱 and 🤬.]
2. Test long enough for the change curve to stabilize.
How long this takes may vary slightly depending on context and type of change.
FAIL: Incorrect results (and decisions) because of the influence of external variables.
📆 Weekday
🎉 Holidays
🛠 Product changes
🌦 Weather
👻 Something is trending on social media
📢 Marketing
Think about what might affect the results.
Are we testing under good conditions? Is the period representative of "normal use" of the service?
FAIL: Assuming that a positive outcome of an A/B test directly means that a hypothesis is validated.
BIAS