Configuring GrowthBook

The default configuration in GrowthBook is optimized for trying things out quickly on a local dev machine. Beyond that, you can customize behavior with environment variables and, optionally, a config.yml file.

Environment Variables

The GrowthBook Docker container supports a number of environment variables:

Domains and Ports

These variables are used to generate links to GrowthBook and to enable CORS access to the API.

  • APP_ORIGIN - default http://localhost:3000
  • API_HOST - default http://localhost:3100

If you need to change the ports for your local dev environment (e.g. if 3000 is already in use), you must update the above variables AND change the port mapping in docker-compose.yml:

growthbook:
  ports:
    - "4000:3000" # example: use 4000 instead of 3000 for the app
    - "4100:3100" # example: use 4100 instead of 3100 for the api

Production Settings

  • NODE_ENV - Set to "production" to turn on additional optimizations and API request logging
  • MONGODB_URI - The MongoDB connection string
  • JWT_SECRET - Auth signing key (use a long random string)
  • ENCRYPTION_KEY - Data source credential encryption key (use a long random string)
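
For example, when deploying with docker-compose.yml, these could be set under the growthbook service's environment key. This is only a sketch; the connection string and secrets below are placeholders you must replace with your own values:

growthbook:
  environment:
    - NODE_ENV=production
    - MONGODB_URI=mongodb://user:password@mongo:27017/growthbook?authSource=admin
    - JWT_SECRET=change-me-to-a-long-random-string
    - ENCRYPTION_KEY=change-me-to-a-different-long-random-string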

Email SMTP Settings

These settings are required in order to send experiment alerts and team member invites.

  • EMAIL_ENABLED ("true" or "false")
  • EMAIL_HOST
  • EMAIL_PORT
  • EMAIL_HOST_USER
  • EMAIL_HOST_PASSWORD
  • EMAIL_USE_TLS ("true" or "false")
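
A sketch of what this could look like in docker-compose.yml, assuming a typical SMTP provider listening on port 587 with TLS (the host and credentials are placeholders):

growthbook:
  environment:
    - EMAIL_ENABLED=true
    - EMAIL_HOST=smtp.example.com
    - EMAIL_PORT=587
    - EMAIL_USE_TLS=true
    - EMAIL_HOST_USER=postmaster@example.com
    - EMAIL_HOST_PASSWORD=change-me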

Google OAuth Settings

Only required if you use Google Analytics as a data source.

  • GOOGLE_OAUTH_CLIENT_ID
  • GOOGLE_OAUTH_CLIENT_SECRET

S3 File Uploads

Optional. If not specified, GrowthBook will use the local file system to store uploads.

  • UPLOAD_METHOD (set to "s3" instead of default value "local")
  • S3_BUCKET
  • S3_REGION (defaults to us-east-1)
  • S3_DOMAIN (defaults to https://${S3_BUCKET}.s3.amazonaws.com/)
  • AWS_ACCESS_KEY_ID (not required when deployed to AWS with an instance role)
  • AWS_SECRET_ACCESS_KEY (not required when deployed to AWS with an instance role)
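
A sketch of a minimal S3 setup in docker-compose.yml. The bucket name and region are placeholders, and the access keys are omitted here on the assumption that the server runs on AWS with an instance role:

growthbook:
  environment:
    - UPLOAD_METHOD=s3
    - S3_BUCKET=my-growthbook-uploads
    - S3_REGION=us-west-2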

Other Settings

  • EXPERIMENT_REFRESH_FREQUENCY - Automatically update experiment results every X hours (default 6)
  • DEFAULT_CONVERSION_WINDOW_HOURS - How many hours a user has to convert after being put into an experiment. Can be overridden on a per-metric basis. (default 72)
  • DISABLE_TELEMETRY - We collect anonymous telemetry data to help us improve GrowthBook. Set to "true" to disable.

Config.yml

In order to use GrowthBook, you need to connect to a data source and define metrics (and optionally dimensions). There are two ways to do this.

The default way to define these is by filling out forms in the GrowthBook UI, which persists them to MongoDB.

The other option is to create a config.yml file. In the Docker container, this file must be placed at /usr/local/src/app/config/config.yml. Below is an example file:

datasources:
  warehouse:
    type: postgres
    name: Main Warehouse
    # Connection params (different for each type of data source)
    params:
      host: localhost
      port: 5432
      user: root
      password: ${POSTGRES_PW} # use env for secrets
      database: growthbook
    # How to query the data (same for all SQL sources)
    settings:
      queries:
        experimentsQuery: >
          SELECT
            user_id,
            anonymous_id,
            received_at as timestamp,
            experiment_id,
            variation_id
          FROM
            experiment_viewed
        pageviewsQuery: >
          SELECT
            user_id,
            anonymous_id,
            received_at as timestamp,
            path as url
          FROM
            pages
metrics:
  signups:
    type: binomial
    name: Sign Ups
    datasource: warehouse
    sql: SELECT user_id, anonymous_id, received_at as timestamp FROM signups
dimensions:
  country:
    name: Country
    datasource: warehouse
    sql: SELECT user_id, country as value from users
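
If you run the official Docker image, one way to get this file into the container is a bind mount in docker-compose.yml. This is a sketch; the host path ./config.yml is an assumption about where you keep the file:

growthbook:
  volumes:
    - ./config.yml:/usr/local/src/app/config/config.yml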

Data Source Connection Params

The contents of the params field for a data source depend on its type.

As seen in the example above, you can use environment variable interpolation for secrets (e.g. ${POSTGRES_PW}).
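
The interpolated variable must exist in the environment of the GrowthBook process that reads config.yml. A sketch of passing it through in docker-compose.yml, assuming POSTGRES_PW is already set on the host or in an .env file:

growthbook:
  environment:
    - POSTGRES_PW=${POSTGRES_PW}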

Redshift, ClickHouse, Postgres, and MySQL (or MariaDB)

type: postgres # or "redshift" or "mysql" or "clickhouse"
params:
  host: localhost
  port: 5432
  user: root
  password: password
  database: growthbook
  ssl: false # ssl setting only works for postgres and redshift currently

Snowflake

type: snowflake
params:
  account: abc123.us-east-1
  username: user
  password: password
  database: GROWTHBOOK
  schema: PUBLIC
  role: SYSADMIN
  warehouse: COMPUTE_WH

BigQuery

BigQuery access requires a service account.

type: bigquery
params:
  projectId: my-project
  clientEmail: growthbook@my-project.iam.gserviceaccount.com
  privateKey: -----BEGIN PRIVATE KEY-----\nABC123\n-----END PRIVATE KEY-----\n

Presto and TrinoDB

type: presto
params:
  engine: presto # or "trino"
  host: localhost
  port: 8080
  username: user
  password: password
  catalog: growthbook
  schema: growthbook

AWS Athena

type: athena
params:
  accessKeyId: AKIA123
  secretAccessKey: AB+cdef123
  region: us-east-1
  database: growthbook
  bucketUri: aws-athena-query-results-growthbook
  workGroup: primary

Mixpanel

Mixpanel access requires a service account.

type: mixpanel
params:
  username: growthbook
  secret: abc123
  projectId: my-project

Google Analytics

Unfortunately at this time there is no way to connect to Google Analytics in config.yml. You must connect via the GrowthBook UI, where we use OAuth and a browser redirect.

Data Source Settings

The settings tell GrowthBook how to query your data.

SQL Data Sources

For data sources that support SQL, there are two queries you need to define, plus an optional Python script used to run queries from inside a Jupyter notebook:

type: postgres
params: ...
settings:
  queries:
    # This query returns experiment variation assignment info
    # One row every time a user was put into an experiment
    experimentsQuery: >
      SELECT
        user_id,
        anonymous_id,
        received_at as timestamp,
        experiment_id,
        variation_id
      FROM
        experiment_viewed
    # This query returns one row for every page view on your site
    # It is used to predict running times before you start an experiment
    pageviewsQuery: >
      SELECT
        user_id,
        anonymous_id,
        received_at as timestamp,
        path as url
      FROM
        pages
  # Used when exporting experiment results to a Jupyter notebook
  # Define a `runQuery(sql)` function that returns a pandas data frame
  notebookRunQuery: >
    import os
    import psycopg2
    import pandas as pd
    from sqlalchemy import create_engine, text

    # Use environment variables or similar for passwords!
    password = os.getenv('POSTGRES_PW')
    connStr = f'postgresql+psycopg2://user:{password}@localhost'
    dbConnection = create_engine(connStr).connect()

    def runQuery(sql):
      return pd.read_sql(text(sql), dbConnection)

Mixpanel

Mixpanel does not support SQL, so we query the data using JQL instead. In order to do this, we just need to know a few event names and properties:

type: mixpanel
params: ...
settings:
  events:
    experimentEvent: Viewed Experiment
    experimentIdProperty: experiment_id
    variationIdProperty: variation_id
    pageviewEvent: Page View
    urlProperty: url

Metrics

Metrics are what your experiments are trying to improve (or at least not hurt).

Below is an example of all the possible settings with comments:

name: Revenue per User
# Required. The data distribution and unit
type: revenue # or "binomial" or "count" or "duration"
# Required. Must match one of the datasources defined in config.yml
datasource: warehouse
# Description supports full markdown
description: This metric is **super** important
# If true, conversions are counted up to 30min before a user is put into an experiment
earlyStart: false
# For inverse metrics, the goal is to DECREASE the value (e.g. "page load time")
inverse: false
# When ignoring nulls, only users who convert are included in the denominator
# Setting to true here would change from "Revenue per User" to "Average Order Value"
ignoreNulls: false
# Which types of users are supported: logged-in (user), logged-out (anonymous), or both
userIdType: "user" # or "anonymous" or "either"
# Any user with a higher metric amount will be capped at this value
# In this case, if someone bought a $10,000 order, it would only be counted as $100
cap: 100
# Hours the user has to convert on this metric after being put in an experiment
conversionWindowHours: 72
# The risk thresholds for the metric.
# If risk < $winRisk, it is highlighted green.
# If risk > $loseRisk, it is highlighted red.
# Otherwise, it's highlighted yellow.
winRisk: 0.0025
loseRisk: 0.0125
# Min number of conversions for an experiment variation before we reveal results
minSampleSize: 150
# The "suspicious" threshold. If the percent change for a variation is above this,
#   we hide the result and label it as suspicious.
# Default 0.5 = 50% change
maxPercentChange: 0.50
# The minimum change required for a result to be considered a win or loss. If the percent
# change for a variation is below this threshold, we will consider an otherwise conclusive
# test a draw.
# Default 0.005 = 0.5% change
minPercentChange: 0.005
# Arbitrary tags used to group related metrics
tags:
  - revenue
  - core

In addition to all of those settings, you also need to tell GrowthBook how to query the metric.

SQL Data Sources

For SQL data sources, you just need to specify a single query. Depending on the other settings, the columns you need to select may differ slightly:

  • timestamp - always required
  • user_id - required unless userIdType is set to "anonymous"
  • anonymous_id - required unless userIdType is set to "user"
  • value - required unless type is set to "binomial"

A full example:

type: duration
userIdType: either
sql: >
  SELECT
    created_at as timestamp,
    user_id,
    anonymous_id,
    duration as value
  FROM
    requests

And a simple binomial metric that only supports logged-in users:

type: binomial
userIdType: user
sql: SELECT user_id, timestamp FROM orders

Mixpanel

For Mixpanel, instead of SQL we just need an event name, a value property (for non-binomial metrics), and optional conditions.

A full example:

type: count
# The event name
table: PDF Downloads
# The numeric property to count (optional)
column: num_pages
# Filter the event by properties
conditions:
  - column: language # property
    operator: "=" # "=", "!=", ">", "<", "<=", ">=", "~", "!~"
    value: en

And a simple binomial metric:

type: binomial
# The event name
table: Purchased

Dimensions

Dimensions let you drill down into your experiment results. They are currently only supported for SQL data sources.

Dimensions have only 3 properties: name, datasource, and sql. The SQL query must return two columns: user_id and value.

Example:

name: Country
# Must match one of the datasources defined in config.yml
datasource: warehouse
sql: SELECT user_id, country as value FROM users

Organization Settings

In addition to data sources, metrics, and dimensions, you can also control some organization settings from config.yml.

Below are all of the currently supported settings:

organization:
  settings:
    # Enable creating experiments using the Visual Editor (beta).  Default `false`
    visualEditorEnabled: true
    # Minimum experiment length (in days) when importing past experiments. Default `6`
    pastExperimentsMinLength: 3
    # Number of days of historical data to use when analyzing metrics
    # (must be between 1 and 400, default `90`)
    metricAnalysisDays: 90