Measuring results in A/B tests
The improvement (or lack thereof) observed is what matters in an A/B test.
Your goal is to be able to make statements like this: “the group that received the new version of the email performed 2-7% better than the control group.”
You should always report a range of improvement/worsening.
Here’s how to calculate open and click rates:
Click rate
$$\text{open rate} = \frac{\text{unique opens}}{\text{emails delivered}}, \qquad \text{click rate} = \frac{\text{unique clicks}}{\text{emails delivered}}$$

$$\text{improvement} = p_B - p_A$$

where $p_A$ is the control group’s rate and $p_B$ is the test group’s rate.
Always use unique counts (unique email opens and/or clicks) and the number of successfully delivered emails.
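As a concrete sketch, here is how the calculation might look in Python; the counts are hypothetical placeholders for your own tracking data:

```python
# Hypothetical counts from a single send; substitute your own tracking data.
delivered = 10_000       # successfully delivered emails
unique_opens = 2_150     # each recipient counted at most once
unique_clicks = 430

open_rate = unique_opens / delivered     # 0.215
click_rate = unique_clicks / delivered   # 0.043

print(f"open rate:  {open_rate:.1%}")    # 21.5%
print(f"click rate: {click_rate:.1%}")   # 4.3%
```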
Margin of error (MOE)
All computed averages/rates have inherent variance/error.
Each observation in an A/B test is what is more generally known as a “Bernoulli trial” (a random experiment with two possible outcomes). The “standard error” calculation is very straightforward and involves using p (the probability of an open or a click, the same number as the rate you calculated above) and N (the number of observations in the group).
The way to calculate standard error is

$$SE = \sqrt{\frac{p\,(1-p)}{N}}$$

where $p$ is the probability of observing an event (the open or click rate) and $N$ is the number of observations.
For a given statistical confidence level, the standard error is multiplied by the critical value of the standard normal distribution, $z_{1-\alpha/2}$ (the one-tail z-score at $1-\alpha/2$), to get the margin of error:

$$MOE = z_{1-\alpha/2} \cdot SE$$
For a 95% confidence interval ($\alpha = 0.05$, so $z_{0.975} = 1.96$), you report your individual open and click rates as

$$p \pm 1.96 \cdot SE$$

(95% is the recommended confidence level for most A/B tests.)
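Here is a minimal Python sketch of the standard error and MOE calculation, reusing the hypothetical counts from the earlier example:

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Margin of error for an observed rate p over n delivered emails.

    z defaults to 1.96, the critical value for a 95% confidence interval.
    """
    se = math.sqrt(p * (1 - p) / n)   # standard error of a Bernoulli rate
    return z * se

# Hypothetical: 430 unique clicks out of 10,000 delivered emails.
p = 430 / 10_000
moe = margin_of_error(p, 10_000)
print(f"click rate: {p:.1%} ± {moe:.1%}")   # 4.3% ± 0.4%
```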
When you report improvement rates (the difference of two rates, each with its own MOE), the equation for the standard error is

$$SE_{\text{diff}} = \sqrt{\frac{p_1\,(1-p_1)}{n_1} + \frac{p_2\,(1-p_2)}{n_2}}$$

where $p_1$ and $p_2$ are the rates for each group, respectively, and $n_1$ and $n_2$ are the number of samples in each group.
Thus, you report the improvement rate as

$$(p_2 - p_1) \pm z_{1-\alpha/2} \cdot SE_{\text{diff}}$$
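And a sketch of the full comparison between two groups; the rates and sample sizes here are invented for illustration:

```python
import math

def improvement_ci(p1, n1, p2, n2, z=1.96):
    """Confidence interval for the difference in rates (group 2 minus group 1)."""
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff - z * se_diff, diff + z * se_diff

# Hypothetical control (group 1) vs. new email (group 2), 10,000 delivered each.
low, high = improvement_ci(p1=0.043, n1=10_000, p2=0.051, n2=10_000)
print(f"improvement: {low:+.1%} to {high:+.1%}")   # roughly +0.2% to +1.4%
```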
As can be seen in the graph below, the standard error peaks when p = 0.5. A/B tests where the conversion rate isn’t on the high or low end tend to require many more samples.
In general, more data lowers the error rate.
[Figure: standard error as a function of p for several sample sizes; more data lowers the error rate.]
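Both points follow directly from the SE formula: the variance term $p(1-p)$ is maximized at $p = 0.5$, and the $1/\sqrt{N}$ dependence means halving the error requires roughly four times as many samples:

$$\frac{d}{dp}\bigl[p(1-p)\bigr] = 1 - 2p = 0 \;\Rightarrow\; p = \tfrac{1}{2}, \qquad SE \propto \frac{1}{\sqrt{N}}$$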
In the next article, we will address how long an A/B test should run.