Technical CUPED details
Here we document the technical details behind GrowthBook CUPED variance estimates.
We use the notation below. We describe our approach in terms of revenue, but any binomial or count metric can be substituted.
- Define $Y_C$ ($Y_T$) as the observed post-exposure revenue for a user exposed to control (treatment).
- Define $X_C$ ($X_T$) as the observed pre-exposure revenue for a user exposed to control (treatment).
- Define $Y$ ($X$) as the post-exposure (pre-exposure) revenue for all users collectively in the experiment.
- Define $\bar{Y}_C$ ($\bar{Y}_T$) as the sample average post-exposure revenue for users exposed to control (treatment).
- Define $\mu_C$ ($\mu_T$) as the population average post-exposure revenue for users exposed to control (treatment).
- Define $N_C$ ($N_T$) as the number of users exposed to control (treatment).
Absolute case
For absolute inference, our target parameter is
$$
\Delta_A = \mu_T - \mu_C.
$$
As described in Equation 4 of (Deng et al. 2013), we find the optimal θ using user data across both control and treatment:
$$
\theta = \frac{\mathrm{cov}(Y, X)}{\mathrm{var}(X)}.
$$
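As a minimal sketch of this step, the snippet below computes θ from pooled control and treatment data using only the Python standard library. The function name `pooled_theta` and the toy revenue values are hypothetical and are not taken from the GrowthBook codebase.

```python
from statistics import fmean


def pooled_theta(y, x):
    """theta = cov(Y, X) / var(X), using all users from both arms."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    # Sample covariance and variance; the (n - 1) factors cancel in the ratio.
    cov_yx = sum((yi - y_bar) * (xi - x_bar) for yi, xi in zip(y, x)) / (n - 1)
    var_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
    return cov_yx / var_x


# Hypothetical toy data: pre-exposure (x) and post-exposure (y) revenue per user.
x_control, y_control = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
x_treat, y_treat = [1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]

# Pool both arms before estimating theta.
theta = pooled_theta(y_control + y_treat, x_control + x_treat)
print(theta)
```

Pooling the two arms means a single θ is shared by the control and treatment adjustments.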
Our estimate of $\Delta_A$ is the difference in adjusted means

$$
\hat{\Delta}_A = (\bar{Y}_T - \theta \bar{X}_T) - (\bar{Y}_C - \theta \bar{X}_C).
$$
Under a superpopulation framework and independence of random assignment, the adjusted means $(\bar{Y}_T - \theta \bar{X}_T)$ and $(\bar{Y}_C - \theta \bar{X}_C)$ are statistically independent.
Therefore, the variance of the difference in adjusted means is the sum of the variances of the adjusted means.
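Define the control (treatment) population covariance between post-exposure and pre-exposure revenue as $\sigma_{XY,C}$ ($\sigma_{XY,T}$), the population post-exposure variance as $\sigma_{YC}^2$ ($\sigma_{YT}^2$), and the population pre-exposure variance as $\sigma_{XC}^2$ ($\sigma_{XT}^2$).
We denote the variances of the control and treatment adjusted means as $V_{adj,C}$ and $V_{adj,T}$, respectively, and they are defined as

$$
\begin{aligned}
V_{adj,C} &= \frac{\sigma_{YC}^2 + \theta^2 \sigma_{XC}^2 - 2\theta \sigma_{XY,C}}{N_C} \\
V_{adj,T} &= \frac{\sigma_{YT}^2 + \theta^2 \sigma_{XT}^2 - 2\theta \sigma_{XY,T}}{N_T}.
\end{aligned}
$$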
Our estimated variance of $\hat{\Delta}_A$ is $\hat{\sigma}_{\Delta_A}^2 = V_{adj,C} + V_{adj,T}$.
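The absolute case can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than GrowthBook's implementation: the helper names and toy data are hypothetical, θ is passed in as an already-computed value, and sample variances and covariances stand in for the population quantities.

```python
from statistics import fmean, variance


def sample_cov(a, b):
    """Unbiased sample covariance between two equal-length sequences."""
    a_bar, b_bar = fmean(a), fmean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def adjusted_mean_variance(y, x, theta):
    """V_adj = (var(Y) + theta^2 var(X) - 2 theta cov(X, Y)) / N for one arm."""
    return (variance(y) + theta**2 * variance(x) - 2 * theta * sample_cov(x, y)) / len(y)


def cuped_absolute(y_c, x_c, y_t, x_t, theta):
    """Return the difference in adjusted means and its estimated variance."""
    adj_c = fmean(y_c) - theta * fmean(x_c)
    adj_t = fmean(y_t) - theta * fmean(x_t)
    # The arms are independent, so the variances of the adjusted means add.
    var_total = adjusted_mean_variance(y_c, x_c, theta) + adjusted_mean_variance(y_t, x_t, theta)
    return adj_t - adj_c, var_total


# Hypothetical toy data and an assumed theta.
x_c, y_c = [1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 7.0, 8.0]
x_t, y_t = [1.0, 2.0, 3.0, 4.0], [4.0, 4.0, 8.0, 10.0]
estimate, var_abs = cuped_absolute(y_c, x_c, y_t, x_t, theta=2.0)
print(estimate, var_abs)
```

Each arm's $V_{adj}$ equals the sample variance of the per-user residual $Y - \theta X$ divided by the arm size, which is a convenient way to sanity-check the formula.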
Relative case
For relative inference (i.e., estimating lift), the parameter of interest is
$$
\Delta_R = \frac{\mu_T - \mu_C}{\mu_C}.
$$
Our estimate of $\Delta_R$ is the difference in adjusted means divided by the control mean:

$$
\hat{\Delta}_R = \frac{(\bar{Y}_T - \theta \bar{X}_T) - (\bar{Y}_C - \theta \bar{X}_C)}{\bar{Y}_C}.
$$
To derive $\hat{\sigma}_{\Delta_R}^2$, the estimated variance of $\hat{\Delta}_R$, we use the delta method.
- Define the control (treatment) population post-exposure variance as $\sigma_{YC}^2$ ($\sigma_{YT}^2$).
- Define the control (treatment) population pre-exposure variance as $\sigma_{XC}^2$ ($\sigma_{XT}^2$).
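- Define the covariance of the sample control means $\Lambda_C = \mathrm{Cov}[\bar{Y}_C, \bar{X}_C] = \frac{1}{N_C} \begin{pmatrix} \sigma_{YC}^2 & \sigma_{XY,C} \\ \sigma_{XY,C} & \sigma_{XC}^2 \end{pmatrix}$.
- Define the covariance of the sample treatment means $\Lambda_T = \mathrm{Cov}[\bar{Y}_T, \bar{X}_T] = \frac{1}{N_T} \begin{pmatrix} \sigma_{YT}^2 & \sigma_{XY,T} \\ \sigma_{XY,T} & \sigma_{XT}^2 \end{pmatrix}$.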
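- Define the vector of population means $\beta_0 = [\mu_{YT}, \mu_{XT}, \mu_{YC}, \mu_{XC}]$.
- Define their sample counterparts as $\hat{\beta} = [\bar{Y}_T, \bar{X}_T, \bar{Y}_C, \bar{X}_C]$.
- Define $\Lambda = \mathrm{Cov}(\hat{\beta}) = \begin{pmatrix} \Lambda_T & 0 \\ 0 & \Lambda_C \end{pmatrix}$,
where $0$ is a $2 \times 2$ matrix of zeros.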
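The covariance matrices defined above can be assembled as follows. This is a sketch only: the helper names and toy data are hypothetical, and sample moments are used as plug-in estimates of the population variances and covariances.

```python
from statistics import fmean, variance


def sample_cov(a, b):
    """Unbiased sample covariance between two equal-length sequences."""
    a_bar, b_bar = fmean(a), fmean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def mean_covariance(y, x):
    """2x2 covariance of (Y-bar, X-bar) for one arm: per-user covariance over N."""
    n = len(y)
    s_xy = sample_cov(x, y)
    return [[variance(y) / n, s_xy / n],
            [s_xy / n, variance(x) / n]]


def block_diagonal(lam_t, lam_c):
    """4x4 covariance of [Y-bar_T, X-bar_T, Y-bar_C, X-bar_C]."""
    zero = [0.0, 0.0]
    # Off-diagonal blocks are zero because the two arms are independent.
    return [lam_t[0] + zero, lam_t[1] + zero, zero + lam_c[0], zero + lam_c[1]]


# Hypothetical toy data for each arm.
x_c, y_c = [1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 7.0, 8.0]
x_t, y_t = [1.0, 2.0, 3.0, 4.0], [4.0, 4.0, 8.0, 10.0]

lam = block_diagonal(mean_covariance(y_t, x_t), mean_covariance(y_c, x_c))
for row in lam:
    print(row)
```

The treatment block comes first so that the row and column order matches the ordering of $\hat{\beta}$.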
By the multivariate central limit theorem, $\hat{\beta}$ is approximately distributed as

$$
\hat{\beta} \sim \mathrm{MVN}(\beta_0, \Lambda).
$$
For vector $\beta$, define its $k$th element as $\beta[k]$.
Define the function
$$
g(\beta; \theta) = \frac{(\beta[1] - \theta \beta[2]) - (\beta[3] - \theta \beta[4])}{\beta[3]}.
$$
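Define the vector of partial derivatives as $\nabla_r = \frac{\partial g(\beta)}{\partial \beta}$, where the individual elements are

$$
\begin{aligned}
\nabla_r[1] &= \frac{1}{\beta[3]} \\
\nabla_r[2] &= \frac{-\theta}{\beta[3]} \\
\nabla_r[3] &= \frac{-\beta[3] - (\beta[1] - \theta \beta[2] - \beta[3] + \theta \beta[4])}{\beta[3]^2} = \frac{-\beta[1] + \theta \beta[2] - \theta \beta[4]}{\beta[3]^2} \\
\nabla_r[4] &= \frac{\theta}{\beta[3]}.
\end{aligned}
$$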
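One way to check these partial derivatives is to compare them against central finite differences. The sketch below does this in plain Python; the function names and the values of β and θ are hypothetical and chosen only for illustration.

```python
def g(beta, theta):
    """Relative lift as a function of [Y-bar_T, X-bar_T, Y-bar_C, X-bar_C]."""
    b1, b2, b3, b4 = beta
    return ((b1 - theta * b2) - (b3 - theta * b4)) / b3


def grad_g(beta, theta):
    """Analytic partial derivatives of g with respect to each element of beta."""
    b1, b2, b3, b4 = beta
    return [
        1.0 / b3,
        -theta / b3,
        (-b1 + theta * b2 - theta * b4) / b3**2,
        theta / b3,
    ]


def numeric_grad(beta, theta, h=1e-6):
    """Central finite-difference approximation of the gradient of g."""
    grads = []
    for k in range(len(beta)):
        up, down = list(beta), list(beta)
        up[k] += h
        down[k] -= h
        grads.append((g(up, theta) - g(down, theta)) / (2 * h))
    return grads


# Hypothetical sample means and theta.
beta_hat = [6.5, 2.5, 5.0, 2.5]
theta = 2.0

analytic = grad_g(beta_hat, theta)
numeric = numeric_grad(beta_hat, theta)
print(analytic)
print(numeric)
```

The two lists should agree to within the finite-difference error, which confirms the simplified form of the third element.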
By the delta method,
$$
\hat{\Delta}_R = g(\hat{\beta}) \sim N\left(\Delta_R = g(\beta_0),\ \nabla_r^\top \Lambda \nabla_r\right).
$$
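The delta-method variance is the quadratic form $\nabla_r^\top \Lambda \nabla_r$, with the gradient evaluated at the sample means. The snippet below is a sketch under assumptions: the function names are hypothetical, and the values of $\hat{\beta}$, θ, and Λ are made-up inputs rather than output from a real experiment.

```python
def delta_gradient(beta, theta):
    """Partial derivatives of g(beta; theta) evaluated at beta."""
    b1, b2, b3, b4 = beta
    return [
        1.0 / b3,
        -theta / b3,
        (-b1 + theta * b2 - theta * b4) / b3**2,
        theta / b3,
    ]


def quadratic_form(v, m):
    """Compute v^T m v for a vector v and a square matrix m (nested lists)."""
    n = len(v)
    return sum(v[i] * m[i][j] * v[j] for i in range(n) for j in range(n))


def relative_cuped(beta_hat, lam, theta):
    """Return the relative lift estimate and its delta-method variance."""
    b1, b2, b3, b4 = beta_hat
    estimate = ((b1 - theta * b2) - (b3 - theta * b4)) / b3
    return estimate, quadratic_form(delta_gradient(beta_hat, theta), lam)


# Hypothetical inputs: sample means, theta, and the block-diagonal covariance.
beta_hat = [6.5, 2.5, 5.0, 2.5]
theta = 2.0
lam = [
    [9.0 / 4.0, 11.0 / 12.0, 0.0, 0.0],
    [11.0 / 12.0, 5.0 / 12.0, 0.0, 0.0],
    [0.0, 0.0, 13.0 / 6.0, 11.0 / 12.0],
    [0.0, 0.0, 11.0 / 12.0, 5.0 / 12.0],
]

lift, var_rel = relative_cuped(beta_hat, lam, theta)
print(lift, var_rel)
```

Because Λ is block diagonal, the same variance can be obtained by summing one 2×2 quadratic form per arm, which avoids building the full 4×4 matrix.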
Decompose $\nabla_r$ into $\nabla_r = [\nabla_r[1:2], \nabla_r[3:4]]$.
Then the final variance