Query Optimization

GrowthBook is designed to run efficient database queries out-of-the-box, but for large datasets or complex metrics, there are a few settings and techniques you can use to optimize your queries.

SQL Template Variables

GrowthBook always includes a date filter in the SQL queries it generates, but because the queries are complex with multiple nested subqueries, database engines are not always able to fully take advantage of these filters.

If you have a date-partitioned table, you can use template variables within the SQL you enter in GrowthBook to provide better hints to your database. This applies to metrics, fact tables, and experiment assignment queries.

Here's an example of a simple Fact Table definition that uses template variables:

SELECT
  timestamp,
  user_id,
  amount
FROM purchases
WHERE
  timestamp BETWEEN '{{startDate}}' AND '{{endDate}}'

The following variables are available:

startDate - yyyy-MM-dd HH:mm:ss of the earliest data that needs to be included
startDateISO - yyyy-MM-dd'T'HH:mm:ss.SSS'Z' of the startDate in ISO format
endDate - yyyy-MM-dd HH:mm:ss of the latest data that needs to be included
endDateISO - yyyy-MM-dd'T'HH:mm:ss.SSS'Z' of the endDate in ISO format

There is also a date helper you can use with the ISO variables to format the date exactly how you need it. For example {{date endDateISO "yyyyMMdd"}}

code	meaning
yyyy	year
MM	month
dd	day
HH	hour
mm	minutes
ss	seconds
t	timestamp

Here's a full example of this being used:

SELECT
  user_id as user_id,
  received_at as timestamp
FROM
  orders
WHERE
  partition_key BETWEEN
    '{{date startDateISO "yyyyMMdd"}}' AND '{{date endDateISO "yyyyMMdd"}}'

note

The inserted values do not have surrounding quotes, so you must add those yourself (e.g. use '{{ startDate }}' instead of {{ startDate }}).

Fact Tables

Fact Tables are a shared SQL definition that can be re-used across multiple metrics. For example, a "Purchases" fact table could be used for both a "Total Revenue" metric and a "Items per Order" metric.

If you are still using legacy metrics (where each metric has its own separate SQL definition), you are missing out on important query optimizations and the newest features in GrowthBook.

Read more about Fact Tables and how to convert legacy metrics to Fact Tables.

Fact Table Query Optimization

GrowthBook Enterprise customers can enable Fact Table Query Optimization for faster, more efficient queries.

If multiple metrics from the same Fact Table are added to an experiment, they will be combined into a single SQL query. For data sourcees with usage-based billing, this can result in dramatic cost savings.

There are some restrictions that limit when this optimization can be performed:

Ratio metrics where the numerator and denominator are part of different Fact Tables are always excluded from this optimization
If Exclude In-Progress Conversions is set for an experiment, optimization is disabled for all metrics
If you are using MySQL and a metric has percentile capping, it will be excluded from optimization

In all other cases, this optimization is enabled by default for all Enterprise customers. It can be disabled under Settings → General → Experiment Settings. When disabled, a separate SQL query will always be run for every individual metric.

Data Pipeline Mode

GrowthBook Enterprise customers can enable Data Pipeline Mode to reduce the amount of duplicate data your warehouse needs to scan.

When enabled, GrowthBook will write some intermediate tables back to your warehouse with short retention and re-use those across all of the metric queries in an experiment.

Currently, this is limited to BigQuery, Snowflake, and Databricks, but we are working on adding support for other data sources soon.

Materialized Views

If your metric definitions are complex and involve multiple joins or subqueries, you may want to consider creating a materialized view in your warehouse.

Setting up materialized views differs by warehouse, so consult the documentation for your specific warehouse for more information.

You can also use a tool like dbt to create computed tables that are automatically refreshed on a schedule.

Pre-Aggregated Tables

Pre-aggregated tables include a GROUP BY in the data pipeline to compress raw event-level data down to fewer rows. This is usually done when querying raw data directly is prohibitively expensive.

GrowthBook supports pre-aggregated tables as long as they satisfy two requirements:

Must be grouped by both user and date
Pre-aggregated columns can only be basic sums or counts. No averages, percentiles, count distinct, or complex derived formulas that break statistical assumptions.

Using pre-aggregated tables for experimentation comes with additional complexities and downsides, so we highly recommend sticking with event-level data whenever possible.

Read more about this approach with some examples and best practices.

SQL Template Variables​

Fact Tables​

Fact Table Query Optimization​

Data Pipeline Mode​

Materialized Views​

Pre-Aggregated Tables​