General formulas and definitions
Formulas for A/B testing
The main metrics used in A/B testing are described in [Stu15]. Let us consider two variants \(X_A\) and \(X_B\) for testing.
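The error probability, or probability of \(X_B > X_A\), is defined as
\[
P[X_B > X_A] = \int_{-\infty}^{\infty} \int_{x_A}^{\infty} f(x_A, x_B) \mathop{dx_B} \mathop{dx_A},
\]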
where \(f(x_A, x_B)\) is the joint probability distribution, which under the assumption of independence factorizes as \(f(x_A, x_B) = f(x_A) f(x_B)\).
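As an illustration, the probability above can be approximated by Monte Carlo sampling: draw each variant independently and count how often \(X_B\) exceeds \(X_A\). The sketch below uses only the Python standard library. The function name `prob_b_beats_a` and the Beta posteriors are hypothetical choices made for this example, not part of [Stu15].

```python
import random


def prob_b_beats_a(sample_a, sample_b, n_draws=50_000, seed=0):
    """Monte Carlo estimate of P[X_B > X_A] for independent variants.

    sample_a and sample_b are callables that take a random.Random
    instance and return one draw from the respective posterior.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_draws):
        # Independence: each variant is drawn separately.
        if sample_b(rng) > sample_a(rng):
            wins += 1
    return wins / n_draws


# Hypothetical Beta posteriors (assumed for illustration only).
def post_a(rng):
    return rng.betavariate(41, 961)


def post_b(rng):
    return rng.betavariate(51, 951)


p_b_wins = prob_b_beats_a(post_a, post_b)
p_tie_check = prob_b_beats_a(post_a, post_a)  # identical posteriors, close to 0.5
```

When both variants share the same posterior, the estimate should be close to 0.5, which is a quick sanity check on the sampler.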
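The expected loss given a joint posterior is the expected value of the loss function, which measures the uplift lost by choosing a given variant. If variant \(X_B\) is chosen, we have
\[
E[L(X_B)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \max(x_A - x_B, 0) f(x_A, x_B) \mathop{dx_B} \mathop{dx_A}.
\]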
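The expected loss of choosing a variant can be estimated the same way, by averaging \(\max(x_A - x_B, 0)\) (or its mirror image for variant \(X_A\)) over independent posterior draws. This is a minimal sketch under assumed Beta posteriors. `expected_losses` is a hypothetical helper name.

```python
import random


def expected_losses(sample_a, sample_b, n_draws=50_000, seed=0):
    """Monte Carlo estimates of the expected loss of choosing A and of choosing B."""
    rng = random.Random(seed)
    loss_a = 0.0
    loss_b = 0.0
    for _ in range(n_draws):
        x_a = sample_a(rng)
        x_b = sample_b(rng)
        loss_a += max(x_b - x_a, 0.0)  # uplift lost if A is chosen
        loss_b += max(x_a - x_b, 0.0)  # uplift lost if B is chosen
    return loss_a / n_draws, loss_b / n_draws


# Hypothetical Beta posteriors (assumed for illustration only).
def post_a(rng):
    return rng.betavariate(41, 961)


def post_b(rng):
    return rng.betavariate(51, 951)


el_a, el_b = expected_losses(post_a, post_b)
```

Because \(\max(b - a, 0) - \max(a - b, 0) = b - a\), the difference `el_a - el_b` estimates \(E[X_B] - E[X_A]\), which gives a simple consistency check.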
Other metrics considered are the relative expected loss (or uplift) and credible intervals. A credible interval is a region that contains the true value with a specified probability.
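One common way to obtain a credible interval is to take equal-tailed quantiles of posterior draws. The sketch below assumes that convention (other definitions, such as highest-density intervals, exist) and a hypothetical Beta posterior. `credible_interval` is an illustrative name.

```python
import random


def credible_interval(draws, level=0.95):
    """Equal-tailed credible interval computed from posterior draws."""
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower = ordered[int(tail * (n - 1))]
    upper = ordered[int((1.0 - tail) * (n - 1))]
    return lower, upper


rng = random.Random(0)
# Hypothetical Beta posterior (assumed for illustration only).
draws = [rng.betavariate(51, 951) for _ in range(20_000)]
ci_low, ci_high = credible_interval(draws, level=0.95)

# Fraction of draws that fall inside the interval; close to 0.95 by construction.
coverage = sum(ci_low <= x <= ci_high for x in draws) / len(draws)
```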
Formulas for Multivariate testing
Let us first introduce some properties of the distribution of the maximum of a set of independent random variables with support on the whole real line.
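The cumulative distribution function of the maximum of \(n\) such variables \(X_1, \ldots, X_n\) is
\[
F_{\max}(z) = P\left[\max_{i} X_i \le z\right] = \prod_{i=1}^{n} F_{X_i}(z),
\]
where \(F_{X_i}(z)\) is the cdf of each random variable \(X_i\). The probability density function is obtained by differentiation,
\[
f_{\max}(z) = \sum_{i=1}^{n} f_{X_i}(z) \prod_{j \neq i} F_{X_j}(z),
\]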
where \(f_{X_i}(z)\) is the pdf of each random variable \(X_i\).
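The two properties of the maximum can be checked numerically. The sketch below assumes independent normal variables (via `statistics.NormalDist`) purely for illustration. It evaluates the product-of-cdfs and sum-of-products expressions and compares the former against an empirical estimate.

```python
import random
from statistics import NormalDist


def max_cdf(dists, z):
    """CDF of max_i X_i for independent X_i: product of the individual CDFs."""
    result = 1.0
    for d in dists:
        result *= d.cdf(z)
    return result


def max_pdf(dists, z):
    """PDF of max_i X_i: sum_i f_i(z) * prod_{j != i} F_j(z)."""
    total = 0.0
    for i, d_i in enumerate(dists):
        term = d_i.pdf(z)
        for j, d_j in enumerate(dists):
            if j != i:
                term *= d_j.cdf(z)
        total += term
    return total


# Hypothetical distributions (assumed for illustration only).
dists = [NormalDist(0.0, 1.0), NormalDist(0.5, 1.5), NormalDist(-0.2, 0.8)]
z = 1.0

# Empirical check: fraction of simulated maxima that do not exceed z.
rng = random.Random(0)
n_draws = 50_000
hits = sum(
    max(rng.gauss(d.mean, d.stdev) for d in dists) <= z for _ in range(n_draws)
)
empirical_cdf = hits / n_draws

# Central finite difference of the CDF, to compare with max_pdf.
h = 1e-4
numeric_pdf = (max_cdf(dists, z + h) - max_cdf(dists, z - h)) / (2 * h)
```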
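The probability to beat all is defined as
\[
P\left[X_i > \max_{j \neq i} X_j\right] = \int_{-\infty}^{\infty} f_{X_i}(x) \prod_{j \neq i} F_{X_j}(x) \mathop{dx}.
\]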
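The probability to beat all reduces to a one-dimensional integral of the pdf of \(X_i\) against the product of the other cdfs, so it can be approximated with a simple quadrature rule. The sketch below assumes normal variants and a trapezoidal rule on a grid spanning ten standard deviations around the mean of \(X_i\). Both choices, and the name `prob_to_beat_all`, are illustrative.

```python
from statistics import NormalDist


def prob_to_beat_all(dists, i, n_grid=4001, width=10.0):
    """Trapezoidal approximation of int f_i(x) * prod_{j != i} F_j(x) dx."""
    d_i = dists[i]
    # f_i is negligible outside mean +/- width * stdev.
    lo = d_i.mean - width * d_i.stdev
    hi = d_i.mean + width * d_i.stdev
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        x = lo + k * step
        value = d_i.pdf(x)
        for j, d_j in enumerate(dists):
            if j != i:
                value *= d_j.cdf(x)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * value
    return total * step


# Hypothetical variants (assumed for illustration only).
dists = [NormalDist(0.0, 1.0), NormalDist(0.5, 1.5), NormalDist(-0.2, 0.8)]
probs = [prob_to_beat_all(dists, i) for i in range(len(dists))]

# Three identically distributed variants: each should win a third of the time.
iid = [NormalDist(0.0, 1.0)] * 3
iid_prob = prob_to_beat_all(iid, 0)
```

Since exactly one variant is the maximum (ties have probability zero for continuous variables), the values in `probs` should sum to one.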
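The expected loss function vs all is defined as
\[
E\left[\max\left(\max_{j \neq i} X_j - X_i, 0\right)\right].
\]
Take \(Y = \max_{j \neq i} X_j\), then we have
\[
E[\max(Y - X_i, 0)] = \int_{-\infty}^{\infty} y f_Y(y) F_{X_i}(y) \mathop{dy} - \int_{-\infty}^{\infty} f_Y(y) F^*_{X_i}(y) \mathop{dy},
\]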
where \(F^*_{X_i}(y) = \int_{-\infty}^y x_i f(x_i) \mathop{dx_i}\).
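To make the decomposition concrete, the sketch below evaluates \(\int f_Y(y) \left[ y F_{X_i}(y) - F^*_{X_i}(y) \right] \mathop{dy}\) with a trapezoidal rule. It assumes normal variants, for which \(F^*_{X}(y) = \mu F_X(y) - \sigma^2 f_X(y)\) in closed form. The function names are hypothetical, and the density of \(Y\) is taken as the pdf of the maximum of the remaining variants.

```python
from statistics import NormalDist


def partial_mean(d, y):
    """F*_X(y) = int_{-inf}^{y} x f(x) dx for a normal X (closed form)."""
    return d.mean * d.cdf(y) - d.variance * d.pdf(y)


def max_pdf(dists, y):
    """PDF of the maximum of independent variables."""
    total = 0.0
    for i, d_i in enumerate(dists):
        term = d_i.pdf(y)
        for j, d_j in enumerate(dists):
            if j != i:
                term *= d_j.cdf(y)
        total += term
    return total


def expected_loss_vs_all(dists, i, n_grid=4001, width=10.0):
    """Trapezoidal approximation of E[max(Y - X_i, 0)], Y = max_{j != i} X_j."""
    others = [d for j, d in enumerate(dists) if j != i]
    # Grid wide enough to cover the effective support of every variant.
    lo = min(d.mean - width * d.stdev for d in dists)
    hi = max(d.mean + width * d.stdev for d in dists)
    step = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        y = lo + k * step
        inner = y * dists[i].cdf(y) - partial_mean(dists[i], y)
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * max_pdf(others, y) * inner
    return total * step


# Hypothetical variants (assumed for illustration only).
dists = [NormalDist(0.0, 1.0), NormalDist(0.5, 1.5), NormalDist(-0.2, 0.8)]
losses = [expected_loss_vs_all(dists, i) for i in range(len(dists))]

# Two standard normal variants: the exact value is 1 / sqrt(pi).
pair_loss = expected_loss_vs_all([NormalDist(0.0, 1.0), NormalDist(0.0, 1.0)], 0)
```

For two independent standard normal variants the expected loss equals \(1/\sqrt{\pi} \approx 0.5642\), which provides a closed-form check of the quadrature.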
References
[Stu15] C. Stucchio. Bayesian A/B Testing at VWO. Visual Web Optimizer, 2015. URL: https://www.chrisstucchio.com/pubs/VWO_SmartStats_technical_whitepaper.pdf.