
Technical power details

Here we document the technical details behind GrowthBook power calculations and minimum detectable effect (MDE) calculations for both frequentist and Bayesian engines.

Frequentist engine details

Frequentist power

Below we describe the technical details of our implementation, starting with the definition of power.

Power is the probability of a statistically significant result.

We use the terms below throughout. Define:

  1. the false positive rate as $\alpha$ (GrowthBook default is $\alpha=0.05$).
  2. the critical values $Z_{1-\alpha/2} = \Phi^{-1}(1-\alpha/2)$ and $Z_{1-\alpha} = \Phi^{-1}(1-\alpha)$, where $\Phi^{-1}$ is the inverse CDF of the standard normal distribution.
  3. the true relative treatment effect as $\Delta$, its estimate as $\hat{\Delta}$, and its estimated standard error as $\hat{\sigma}_{\Delta}$. Note that as the sample size $n$ increases, $\hat{\sigma}_{\Delta}$ decreases by a factor of $1/\sqrt{n}$.

We make the following assumptions:

  1. equal sample sizes across control and treatment variations (if sample sizes are unequal, use the smaller of the two, which produces conservative power estimates);
  2. equal variance across control and treatment variations;
  3. observations across users are independent and identically distributed;
  4. all metrics have finite variance; and
  5. you are running a two-sample t-test (if in practice you use CUPED, your power will be higher).

For a 1-sided test, the power is

\begin{align} \pi = P\left(\frac{\hat{\Delta}}{\hat{\sigma}_{\Delta}} > Z_{1-\alpha}\right) = P\left(\frac{\hat{\Delta}-\Delta}{\hat{\sigma}_{\Delta}} > Z_{1-\alpha}-\frac{\Delta}{\hat{\sigma}_{\Delta}}\right) = 1 - \Phi\left(Z_{1-\alpha}-\frac{\Delta}{\hat{\sigma}_{\Delta}}\right). \end{align}

For a 2-sided test (all GrowthBook tests are 2-sided), power is composed of the probability of a statistically significant positive result and the probability of a statistically significant negative result. Using the same algebra as in Equation 1 (except with $Z_{1-\alpha/2}$ as the critical value), the probability of a statistically significant positive result is

\begin{align} \pi_{pos} &= 1 - \Phi\left(Z_{1-\alpha/2}-\frac{\Delta}{\hat{\sigma}_{\Delta}}\right). \end{align}

The probability of a statistically significant negative result is

\begin{align} \pi_{neg} &= P\left(\frac{\hat{\Delta}}{\hat{\sigma}_{\Delta}} < Z_{\alpha/2}\right) = P\left(\frac{\hat{\Delta}-\Delta}{\hat{\sigma}_{\Delta}} < Z_{\alpha/2}-\frac{\Delta}{\hat{\sigma}_{\Delta}}\right) = \Phi\left(Z_{\alpha/2}-\frac{\Delta}{\hat{\sigma}_{\Delta}}\right). \end{align}

For a 2-sided test, the power equals

\begin{align} \pi = 1 - \Phi\left(Z_{1-\alpha/2}-\frac{\Delta}{\hat{\sigma}_{\Delta}}\right) + \Phi\left(Z_{\alpha/2} - \frac{\Delta}{\hat{\sigma}_{\Delta}}\right). \end{align}
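As a concrete illustration, the two-sided power formula can be sketched in a few lines of Python. This is a hypothetical helper, not GrowthBook's production code; `delta` is the true relative effect $\Delta$ and `se` is $\hat{\sigma}_{\Delta}$:

```python
from statistics import NormalDist

def frequentist_power(delta: float, se: float, alpha: float = 0.05) -> float:
    """Two-sided power: pi = pi_pos + pi_neg, per the equations above."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)            # Z_{1-alpha/2}
    pi_pos = 1 - nd.cdf(z - delta / se)      # significant positive result
    pi_neg = nd.cdf(-z - delta / se)         # significant negative result
    return pi_pos + pi_neg
```

Setting `delta=0` recovers the false positive rate $\alpha$, a quick sanity check on the formula.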

Frequentist minimum detectable effect

Some customers want to know what effect size is required to produce at least $\pi$ power.
The minimum detectable effect is the smallest $\Delta$ for which nominal power (e.g., 80%) is achieved.

Below we describe commonly used MDE calculations, though we do not use these at GrowthBook.
For a 1-sided test there is a closed form solution for the MDE. Solving Equation 1 for $\Delta$ produces

\begin{align} \text{MDE} &= \hat{\sigma}_{\Delta}\left(\Phi^{-1}(1 - \alpha)-\Phi^{-1}(1 - \pi)\right). \end{align}

In the 2-sided case there is no closed form solution.
In practice the MDE is often defined as the solution obtained by inverting Equation 2.
This ignores the negligible term in Equation 3 and produces power estimates very close to $\pi$:

\begin{align} \text{MDE}_{\text{two-sided}} &= \hat{\sigma}_{\Delta}\left(\Phi^{-1}(1 - \alpha/2)-\Phi^{-1}(1 - \pi)\right). \end{align}
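A sketch of this textbook two-sided MDE (again, not the approach GrowthBook uses for relative effects; the helper name is illustrative):

```python
from statistics import NormalDist

def textbook_mde(se: float, alpha: float = 0.05, power: float = 0.8) -> float:
    """Common two-sided MDE approximation: se * (Z_{1-alpha/2} - Z_{1-power})."""
    nd = NormalDist()
    return se * (nd.inv_cdf(1 - alpha / 2) - nd.inv_cdf(1 - power))
```

With $\alpha=0.05$ and $\pi=0.8$ the multiplier is the familiar $1.96 + 0.84 \approx 2.8$.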

This approach works when effects are defined on the absolute scale, where the uncertainty of the effect estimate does not depend upon the true absolute effect. For relative inference this does not hold, so GrowthBook uses a different approach. The terms below help define the variance of the sample lift.

  1. Define $\Delta_{Abs}$ as the absolute effect.
  2. Define $\mu_{A}$ as the population mean of variation $A$ and $\sigma^{2}$ as the population variance.
  3. For variation $B$, analogously define $\mu_{B}$; recall that we assume equal variance across treatment arms.
  4. Define $N$ as the per-variation sample size.
  5. Define the sample counterparts as ($\hat{\mu}_{A}$, $\hat{\sigma}_{A}^{2}$, $\hat{\mu}_{B}$, and $\hat{\sigma}_{B}^{2}$).

Then the variance of the sample lift is

\begin{align} \hat{\sigma}_{\Delta}^{2} &= \frac{\sigma^{2}}{N}\frac{1}{\mu_{A}^{2}} + \frac{\sigma^{2}}{N}\frac{\mu_{B}^{2}}{\mu_{A}^{4}} \\ &= \frac{\sigma^{2}}{N}\frac{1}{\mu_{A}^{2}} + \frac{\sigma^{2}}{N}\frac{\left(\mu_{A}+\Delta_{Abs}\right)^{2}}{\mu_{A}^{4}}. \end{align}

Therefore, when inverting the power formula above to find the minimum $\Delta$ that produces at least 80% power, the uncertainty term $\hat{\sigma}_{\Delta}$ changes as $\Delta$ changes.
To find the MDE we solve the equation below, where we make explicit the dependence of $\hat{\sigma}_{\Delta}$ on $\Delta$:

\begin{align*} \frac{\Delta}{\hat{\sigma}_{\Delta}(\Delta)} = \Phi^{-1}\left(1-\alpha/2\right) - \Phi^{-1}(1 - \pi). \end{align*}

Define the constant $k = \Phi^{-1}\left(1-\alpha/2\right) - \Phi^{-1}(1 - \pi)$. We solve for $\mu_{B}$ in:

\frac{(\mu_{B}-\mu_{A})/\mu_{A}}{\sqrt{\text{Var}(\hat{\Delta})}} = k \iff (\mu_{B}-\mu_{A})^{2} = k^{2}\mu_{A}^{2}\text{Var}(\hat{\Delta}) = k^{2}\mu_{A}^{2}\left(\frac{\sigma^{2}}{N}\frac{1}{\mu_{A}^{2}} + \frac{\sigma^{2}}{N}\frac{\mu_{B}^{2}}{\mu_{A}^{4}}\right).

Rearranging terms shows that

\mu_{B}^{2}\left(1-\frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}}\right) - 2\mu_{A}\mu_{B} + \left(\mu_{A}^{2}-k^{2}\frac{\sigma^{2}}{N}\right) = 0.

This is quadratic in $\mu_{B}$ and has solution

\mu_{B} = \frac{2\mu_{A} \pm \sqrt{4\mu_{A}^{2}-4\left(1-\frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}}\right)\left(\mu_{A}^{2}-k^{2}\frac{\sigma^{2}}{N}\right)}}{2\left(1-\frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}}\right)} = \frac{\mu_{A} \pm \sqrt{\mu_{A}^{2}-\left(1-\frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}}\right)\left(\mu_{A}^{2}-k^{2}\frac{\sigma^{2}}{N}\right)}}{1-\frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}}}.

The discriminant reduces to

k^{2}\frac{\sigma^{2}}{N}\left(2 - \frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}}\right),

so a solution for $\mu_{B}$ exists if and only if

\begin{align} 2 - \frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}} > 0 \iff 2 > \frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}} \iff N > \frac{\sigma^{2}k^{2}}{2\mu_{A}^{2}}. \end{align}

Similarly, the MDE returned can be negative if the denominator is negative, which is nonsensical.
We therefore return an MDE only when the denominator is positive, which occurs if and only if:

\begin{align} 1-\frac{\sigma^{2}}{N}\frac{k^{2}}{\mu_{A}^{2}} > 0 \iff N > \frac{\sigma^{2}k^{2}}{\mu_{A}^{2}}. \end{align}

The condition in Equation 10 is stricter than the condition in Equation 9.

In summary, there will be some combinations of $(\mu_{A}, \sigma^{2})$ where the MDE does not exist for a given $N$. If $\alpha=0.05$ and $\pi=0.8$, then $k\approx 2.8$. Therefore, a rule of thumb is that $N$ needs to be roughly 9 times larger than the ratio of the variance to the squared mean to return an MDE. In these cases, $N$ needs to be increased.
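Putting the pieces together, the relative MDE calculation above can be sketched as follows. This is a simplified, hypothetical implementation that takes the positive root of the quadratic and returns `None` when the existence conditions in Equations 9 and 10 fail:

```python
import math
from statistics import NormalDist

def relative_mde(n, mean, variance, alpha=0.05, power=0.8):
    """Relative MDE from the quadratic in mu_B; n is the per-variation
    sample size, mean and variance describe variation A (mu_A, sigma^2)."""
    nd = NormalDist()
    k = nd.inv_cdf(1 - alpha / 2) - nd.inv_cdf(1 - power)
    v = variance / n
    denom = 1 - v * k**2 / mean**2                # must be positive (Equation 10)
    if denom <= 0:
        return None
    disc = k**2 * v * (2 - v * k**2 / mean**2)    # reduced discriminant
    if disc < 0:                                  # cannot occur when denom > 0
        return None
    mu_b = (mean + math.sqrt(disc)) / denom       # positive root of the quadratic
    return (mu_b - mean) / mean                   # convert mu_B to a relative lift
```

For example, with `n=1500`, `mean=0.1`, and `variance=0.5`, this returns roughly 1.27, i.e., a 127% relative lift; with `n=100` the denominator is negative and no MDE exists.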

Sequential testing

To estimate power under sequential testing, we adjust the standard error $\hat{\sigma}_{\Delta}$ to account for sequential testing, and then input this adjusted variance into our power formula. We assume that you look at the data only once, so our power estimate below is a lower bound for the actual power under sequential testing. Otherwise we would have to make assumptions about the temporal correlation of the data generating process.

In sequential testing we construct confidence intervals as

\begin{align*} \hat{\Delta} \pm \hat{\sigma}\sqrt{N}\sqrt{\frac{2(N\rho^2 + 1)}{N^2\rho^2}\log\left(\frac{\sqrt{N\rho^2 + 1}}{\alpha}\right)} \end{align*}

where

\rho = \sqrt{\frac{-2\log(\alpha) + \log(-2\log(\alpha) + 1)}{N^*}}

and $N^*$ is a tuning parameter. This approach relies upon asymptotic normality. For power analysis we rewrite the confidence interval as

\begin{align*} \hat{\Delta} &\pm \hat{\sigma}\sqrt{N}\sqrt{\frac{2(N\rho^2 + 1)}{N^2\rho^2}\log\left(\frac{\sqrt{N\rho^2 + 1}}{\alpha}\right)}\frac{Z_{1-\alpha/2}}{Z_{1-\alpha/2}} \\ &= \hat{\Delta} \pm \tilde{\sigma}Z_{1-\alpha/2} \end{align*}

where

\tilde{\sigma} = \hat{\sigma}\sqrt{N}\sqrt{\frac{2(N\rho^2 + 1)}{N^2\rho^2}\log\left(\frac{\sqrt{N\rho^2 + 1}}{\alpha}\right)}\frac{1}{Z_{1-\alpha/2}}.

We use the power analysis described above, except we substitute $\tilde{\sigma}^{2}$ for $\hat{\sigma}_{\Delta}^{2}$.
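The variance adjustment can be sketched as below. This is a hypothetical helper, and the default for the tuning parameter `n_star` ($N^*$) is illustrative, not GrowthBook's production setting:

```python
import math
from statistics import NormalDist

def sequential_adjusted_sigma(sigma_hat, n, alpha=0.05, n_star=5000):
    """Compute sigma_tilde: the sequential CI halfwidth divided by Z_{1-alpha/2}."""
    # rho^2 from the tuning-parameter formula above
    rho2 = (-2 * math.log(alpha) + math.log(-2 * math.log(alpha) + 1)) / n_star
    z = NormalDist().inv_cdf(1 - alpha / 2)
    halfwidth_factor = math.sqrt(
        2 * (n * rho2 + 1) / (n**2 * rho2)
        * math.log(math.sqrt(n * rho2 + 1) / alpha)
    )
    return sigma_hat * math.sqrt(n) * halfwidth_factor / z
```

Squaring the result gives the $\tilde{\sigma}^{2}$ that replaces $\hat{\sigma}_{\Delta}^{2}$ in the power formula; note that it is strictly larger than the unadjusted input, reflecting the price of sequential validity.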

Bayesian engine details

Bayesian power

For Bayesian power analysis, we let users specify the prior distribution of the treatment effect. We then estimate Bayesian power, which is the probability that the $(1-\alpha)$ credible interval does not contain 0.

We assume a conjugate normal-normal model, as follows:

\begin{align*} \Delta &\sim \mathcal{N}\left(\mu_{prior}, \sigma_{prior}^{2}\right) \\ \hat{\Delta}\mid\Delta &\sim \mathcal{N}\left(\Delta, \hat{\sigma}_{\Delta}^{2}\right). \end{align*}

In words, the model has two parts: 1) the normal prior for the treatment effect, which is specified by you; and 2) conditional upon the treatment effect, the estimated effect is normally distributed. The normal prior has several advantages, including: 1) bell-shaped distribution around the prior mean, so that extreme estimates will be shrunk more towards the prior than moderate estimates; 2) the ability to specify two moments, which is often the right amount of information for a prior; and 3) simplicity.
The conditional normality of the effect estimate is motivated by the central limit theorem.

We use the normal distribution below to approximate the posterior:

\begin{align*} \Delta\mid\hat{\Delta} &\sim \mathcal{N}\left(\Omega^{-1}\omega, \Omega^{-1}\right) \\ \Omega &= 1/\sigma_{prior}^{2} + 1/\hat{\sigma}_{\Delta}^{2} \\ \omega &= \mu_{prior}/\sigma_{prior}^{2} + \hat{\Delta}/\hat{\sigma}_{\Delta}^{2}. \end{align*}

This is an approximation to the posterior because $\Delta$ affects $\hat{\sigma}_{\Delta}^{2}$. We tested this approximation through extensive simulations and found it had comparable coverage and mean squared error to a posterior distribution empirically sampled using Metropolis-Hastings.
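The posterior update itself is only a few lines; here is a minimal sketch with hypothetical helper names:

```python
def posterior_approx(delta_hat, se2, prior_mean, prior_var):
    """Normal-normal posterior approximation: returns (posterior mean, posterior variance).
    se2 is sigma_hat_Delta^2; precision and weight are Omega and omega from the text."""
    precision = 1 / prior_var + 1 / se2                  # Omega
    weight = prior_mean / prior_var + delta_hat / se2    # omega
    return weight / precision, 1 / precision             # (Omega^{-1} omega, Omega^{-1})
```

With a very diffuse prior the posterior collapses to the data estimate; with an informative prior, the estimate is shrunk toward the prior mean in proportion to the relative precisions.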

We define rejection as the $100(1-\alpha)\%$ credible interval not containing zero.
For our posterior approximation, this occurs if the posterior mean of $\Delta\mid\hat{\Delta}$ (i.e., $\Omega^{-1}\omega$) divided by its posterior standard deviation (i.e., $\sqrt{\Omega^{-1}}$) is beyond the appropriate critical threshold $Z^{\star}$ (e.g., $\Phi^{-1}(0.975)$ for $\alpha=0.05$).

Within a Bayesian framework, it is useful to allow for the case where the prior model is misspecified.
That is, the prior specified by the customer may differ from the true prior that generates the treatment effect.
We permit misspecification of the prior for $\Delta$: we assume that the true data generating process (DGP) is $\Delta \sim \mathcal{N}\left(\mu_{\star}, \sigma_{\star}^{2}\right)$, while the specified DGP has $\Delta \sim \mathcal{N}\left(\mu_{prior}, \sigma_{prior}^{2}\right)$. We assume the prior is specified on the relative scale.

In the derivations below we use the marginal distribution of $\hat{\Delta}$, which we find via its moment generating function:

\begin{align*} E\left[e^{t\hat{\Delta}}\right] &= E_{\Delta}\left[E\left[e^{t\hat{\Delta}}\mid\Delta\right]\right] \\ &= E_{\Delta}\left[e^{t\Delta + t^{2}\hat{\sigma}_{\Delta}^{2}/2}\right] \\ &= e^{t^{2}\hat{\sigma}_{\Delta}^{2}/2}\,E_{\Delta}\left[e^{t\Delta}\right] \\ &= e^{t^{2}\hat{\sigma}_{\Delta}^{2}/2}\,e^{t\mu_{\star} + t^{2}\sigma_{\star}^{2}/2}, \end{align*}

which is the moment generating function of a normal distribution, so marginally $\hat{\Delta} \sim \mathcal{N}\left(\mu_{\star}, \sigma_{\star}^{2}+\hat{\sigma}_{\Delta}^{2}\right)$.

For a 2-sided test the probability of rejection is

\begin{align*} &P\left(\left|\frac{\Omega^{-1}\omega}{\Omega^{-1/2}}\right| > Z_{1-\alpha/2}\right) \\
&= P\left(\left|\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{-1/2}\left(\frac{\mu_{prior}}{\sigma_{prior}^{2}} + \frac{\hat{\Delta}}{\hat{\sigma}_{\Delta}^{2}}\right)\right| > Z_{1-\alpha/2}\right) \\
&= P\left(\left|\frac{\mu_{prior}}{\sigma_{prior}^{2}} + \frac{\hat{\Delta}}{\hat{\sigma}_{\Delta}^{2}}\right| > \left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2}\right) \\
&= P\left(\frac{\mu_{prior}}{\sigma_{prior}^{2}} + \frac{\hat{\Delta}}{\hat{\sigma}_{\Delta}^{2}} > \left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2}\right) + P\left(\frac{\mu_{prior}}{\sigma_{prior}^{2}} + \frac{\hat{\Delta}}{\hat{\sigma}_{\Delta}^{2}} < -\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2}\right) \\
&= P\left(\hat{\Delta} > \hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} - \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]\right) + P\left(\hat{\Delta} < -\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} + \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]\right) \\
&= P\left(\frac{\hat{\Delta}-\mu_{\star}}{\sqrt{\hat{\sigma}_{\Delta}^{2}+\sigma_{\star}^{2}}} > \frac{\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} - \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]-\mu_{\star}}{\sqrt{\hat{\sigma}_{\Delta}^{2}+\sigma_{\star}^{2}}}\right) + P\left(\frac{\hat{\Delta}-\mu_{\star}}{\sqrt{\hat{\sigma}_{\Delta}^{2}+\sigma_{\star}^{2}}} < \frac{-\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} + \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]-\mu_{\star}}{\sqrt{\hat{\sigma}_{\Delta}^{2}+\sigma_{\star}^{2}}}\right) \\
&= 1-\Phi\left(\frac{\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} - \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]-\mu_{\star}}{\sqrt{\hat{\sigma}_{\Delta}^{2}+\sigma_{\star}^{2}}}\right) + \Phi\left(\frac{-\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} + \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]-\mu_{\star}}{\sqrt{\hat{\sigma}_{\Delta}^{2}+\sigma_{\star}^{2}}}\right). \end{align*}

In practice, GrowthBook assumes there is a true fixed effect size, i.e., the variance of the data generating process $\sigma_{\star}^{2}$ equals 0 and $\mu_{\star} = \Delta$, so two-sided power is

\begin{align} \begin{split} \pi &= 1-\Phi\left(\frac{\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} - \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]-\Delta}{\sqrt{\hat{\sigma}_{\Delta}^{2}}}\right) \\ &+ \Phi\left(\frac{-\hat{\sigma}_{\Delta}^{2}\left[\left(\frac{1}{\sigma_{prior}^{2}} + \frac{1}{\hat{\sigma}_{\Delta}^{2}}\right)^{1/2} Z_{1-\alpha/2} + \frac{\mu_{prior}}{\sigma_{prior}^{2}}\right]-\Delta}{\sqrt{\hat{\sigma}_{\Delta}^{2}}}\right). \end{split} \end{align}

We assume that $\sigma_{\star}^{2}=0$ for simplicity and because large values of $\sigma_{\star}^{2}$ can result in negative MDEs (see the discussion of Bayesian MDEs below). If the prior variance $\sigma_{prior}^{2}$ equals infinity, then Equation 11 reduces to Equation 4.
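This power expression translates directly into code. The sketch below is a hypothetical helper, assuming $\sigma_{\star}^{2}=0$ and $\mu_{\star}=\Delta$ as in the text, and it makes the flat-prior reduction to the frequentist formula easy to verify numerically:

```python
import math
from statistics import NormalDist

def bayesian_power(delta, se2, prior_mean, prior_var, alpha=0.05):
    """Two-sided Bayesian power with sigma_star^2 = 0; se2 is sigma_hat_Delta^2."""
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    se = math.sqrt(se2)
    a = math.sqrt(1 / prior_var + 1 / se2)   # (1/sigma_prior^2 + 1/se2)^{1/2}
    upper_cut = se2 * (a * z - prior_mean / prior_var)    # rejection cutoff, positive side
    lower_cut = -se2 * (a * z + prior_mean / prior_var)   # rejection cutoff, negative side
    return 1 - nd.cdf((upper_cut - delta) / se) + nd.cdf((lower_cut - delta) / se)
```

As the prior variance grows, the result approaches the two-sided frequentist power for the same `delta` and standard error, matching the stated reduction.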

Bayesian minimum detectable effect

MDEs are not well defined in the Bayesian literature. We provide MDEs in Bayesian power analysis for customers who are used to conceptualizing MDEs and want to leverage prior information in their analysis.

We could define the MDE as the minimum value of $\mu_{\star}$ such that at least $\pi$ power is achieved. This definition is Bayesian in that it permits uncertainty in the parameters of the data generating process. However, if $\sigma_{\star}^{2}$ is large, then there are some combinations of parameters where the MDE can be negative; that is, negative values of $\mu_{\star}$ result in power of at least $\pi$. Usually the inferential focus is the true treatment effect for the experiment ($\Delta$), not the population mean from which $\Delta$ is just one realization ($\mu_{\star}$), so we set $\sigma_{\star}^{2} = 0$ and consequently $\Delta = \mu_{\star}$. This is why in practice we frame our Bayesian MDE as, "given our prior beliefs and the data generating process, what is the probability we can detect an effect of size $\Delta$?", where $\Delta$ is a fixed number.

Another subtlety is that for a fixed sample size, Equation 11 can be decreasing in effect size, as illustrated in Figure 1.

Figure 1

(Figure: Bayesian power as a function of relative effect size.)

Figure 1 shows the case where the group sample sizes are 1500, the data mean is 0.1, the data variance is 0.5, the specified prior mean is $\mu_{prior} = 0.1$, the specified prior variance is $\sigma_{prior}^{2} = 0.3$, and the variance of the data generating process is $\sigma_{\star}^{2} = 0$. Nominal 80% power occurs at an effect size of 1.65; power continues to increase until the effect size is about 3, and then begins decreasing. Power decreases in effect size because the variance in Equation 8 is quadratic in effect size, so in Equation 11 the term in front of $Z_{1-\alpha/2}$ goes to infinity as $\Delta$ gets large. The coefficient in front of $Z_{1-\alpha/2}$ in Equation 4 is 1, so frequentist power is increasing in effect size in all cases. Monotonicity does hold for Bayesian power for absolute effects, where the variance is not affected by the effect size.

Because power is not monotonic in effect size, we perform a grid search across effect sizes ranging from 0 to 500%. The derivative of Equation 11 is bounded in absolute value by $2\phi(0) < 0.8$, where $\phi(\cdot)$ is the density of the standard normal distribution.

Let the length of one grid cell equal $l$. We evaluate power at the points $\left\{0, l, 2l, \ldots, 5-l, 5\right\}$. Suppose the power at the $k^{\text{th}}$ gridpoint is $\pi_{k}$, $k>0$. Because 1) the maximum slope from the midpoint to the endpoint of the cell is no greater than $2\phi(0)$; and 2) the maximum distance from a point where power is evaluated is $l/2$, the maximum power in $\left[(k-1)l, kl\right]$ is no greater than $\max\left(\pi_{k-1}, \pi_{k}\right)+\phi(0)l$. Motivated by this fact, we describe our approach in Algorithm 1.

In words, Algorithm 1 evaluates power at $\Delta \in \left\{0, 0.001, 0.002, \ldots, 5\right\}$ until it finds the first element $k$ such that $\pi(k) \geq \pi - \phi(0)l$. If power exceeds this threshold, we evaluate a finer grid across the range $[k-l, k]$, where the grid cell length is $l' \ll l$. We find the first element (if it exists) of this finer grid where power is at least $\pi$. We return this first element as the solution if it exists; otherwise we keep searching the coarse grid.

Algorithm 1

  1. Define $l$ as the length between points at which power is evaluated ($l=0.001$ in production).
  2. Define the grid of points between 0 and 5 as $\mathcal{G} = \left\{0, l, 2l, \ldots, 5-l, 5\right\}$.
  3. Begin evaluating $\pi(k)$ for $k \in \mathcal{G}$.
  4. If $\pi(k) < \pi - \phi(0)l$ for all $k$, then the MDE does not exist. Otherwise:
  5. Find the first $k \in \mathcal{G}$ such that $\pi(k) \geq \pi - \phi(0)l$. Define $l'$ as a finer grid resolution (in production, $l' = l/100$). Find the first element of the set $\left\{k-l, k-l+l', k-l+2l', \ldots, k-l', k\right\}$ such that power evaluated at that point is at least $\pi$. If no such point exists, return to the coarser grid search in Step 3.

If power exceeds $\pi + \phi(0)l'$ at any point in $[0, 5]$, then Algorithm 1 is guaranteed to detect it. In practice we use $l=10^{-3}$ and $l'=10^{-5}$.
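A minimal sketch of Algorithm 1 (hypothetical helper; `power_fn` maps an effect size to power, e.g., a Bayesian power function like Equation 11):

```python
import math

PHI_0 = 1 / math.sqrt(2 * math.pi)   # standard normal density at 0

def grid_search_mde(power_fn, target=0.8, l=1e-3, l_fine=1e-5, upper=5.0):
    """Coarse-then-fine grid search for the smallest effect size with power >= target."""
    k = 0.0
    while k <= upper:
        if power_fn(k) >= target - PHI_0 * l:   # coarse-grid trigger (Step 5)
            x = max(k - l, 0.0)
            while x <= k + 1e-9:                # refine over [k - l, k]
                if power_fn(x) >= target:
                    return x
                x += l_fine
        k += l                                  # fine grid failed: resume coarse scan
    return None                                 # MDE does not exist (Step 4)
```

For a monotone toy power curve such as $1 - e^{-\Delta}$, the search returns $\ln 5 \approx 1.609$, the exact point at which power reaches 0.8.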