Geo Tests: Reading Your Results

Getting Started

After your geo test has completed and your results have been finalized, you’ll receive comprehensive results detailing everything the test found.

This article covers interpreting your geo test results.

Click here to read about the concepts and terms involved with geo testing.

Click here for how-to instructions on setting up your own geo test.

Accessing Your Results

When your test’s results are ready, you can view them by going to the My Tests section at the bottom of the main Geo Tests page and clicking the Finished tab.

Here, you will see an archive of all of your completed geo tests, alongside some high-level details. Clicking either the test’s name or the Detail arrow at the right of its row will take you to a more detailed view.

Note: If your test has ended but the results are still being processed, it will still appear under the Active tab with the label Results Pending.

The Results Page

The in-depth review of your test’s findings contains multiple sections, with the conditions of the test displayed along the top for reference. This information can also be downloaded as a PDF via the link at the bottom-right of the window.

Business Impact

The first section of your results is Business Impact, which shows details about primary metrics that your test has uncovered. It is divided into three main sections, along with a recommendation for how to optimize your future budgeting based on the test’s findings.

Conversions

In the first panel, you’ll see the actual conversions that the test yielded versus the amount anticipated by our data science team’s prediction model.

  • Actual results are what we know to have happened because test conditions were implemented. These come from the transaction data provided by your integrated platforms (e.g., Shopify). Note that we count all transactions in this data, not just what your platform reports to you.
  • Predicted results are what can be assumed to have happened if test conditions were not implemented. These are based on the past two years of data for both your conversions and the tested markets. Lasso regression modeling predicts conversions for the markets in your test based on the sales patterns of the markets your test was modeled on. This lets us accurately project what would have continued to happen under regular circumstances.

    Click here for more on how markets are selected and modeled for your geo tests.
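To make the prediction model concrete, here is a minimal sketch of counterfactual forecasting with lasso regression. The market counts, weekly data shapes, and `alpha` value are illustrative assumptions using scikit-learn, not Measured’s actual pipeline.

```python
# Hedged sketch: predicting test-market conversions from the sales patterns
# of control markets using lasso regression. All numbers are synthetic.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(42)

# Two years (104 weeks) of weekly conversions for 10 candidate control markets.
weeks = 104
control_sales = rng.poisson(lam=200, size=(weeks, 10)).astype(float)

# Assume the test market historically tracks a weighted blend of a few controls.
true_weights = np.array([0.5, 0.3, 0, 0, 0.2, 0, 0, 0, 0, 0])
test_market_sales = control_sales @ true_weights + rng.normal(0, 5, weeks)

# Fit on the pre-test period; lasso shrinks irrelevant markets toward zero.
model = Lasso(alpha=1.0, positive=True).fit(control_sales, test_market_sales)

# During the test, control markets run as usual, so their sales project the
# test market's "business as usual" counterfactual.
test_period = rng.poisson(lam=200, size=(4, 10)).astype(float)
predicted = model.predict(test_period)
print(predicted.round(1))  # counterfactual conversions for each test week
```

The gap between these predicted values and the conversions actually observed during the test is what the Conversions panel reports.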

For a holdout test, if predicted results are higher than the observed results, this indicates the channel is having an incremental impact: customers did not see your media, so they did not make a purchase.

Scale tests are the exact opposite. If observed results are higher than predicted results, that’s when we know a channel is making an impact. If predicted results are higher, the results are inconclusive.
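The two readout rules above can be sketched as a small decision function. The function name and the treatment of the remaining holdout case as inconclusive are illustrative assumptions, not Measured’s exact logic.

```python
# Hedged sketch of the holdout/scale interpretation rules described above.
def read_result(test_type: str, predicted: float, actual: float) -> str:
    """Interpret predicted vs. actual conversions for a geo test."""
    if test_type == "holdout":
        # Media withheld: fewer actual conversions than predicted
        # indicates the channel was driving incremental purchases.
        return "incremental impact" if predicted > actual else "inconclusive"
    if test_type == "scale":
        # Budget increased: more actual conversions than predicted
        # indicates the extra spend made an impact.
        return "incremental impact" if actual > predicted else "inconclusive"
    raise ValueError("unknown test type")

print(read_result("holdout", predicted=500, actual=420))  # incremental impact
print(read_result("scale", predicted=500, actual=560))    # incremental impact
```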

Confidence Rating

Note: This feature only applies to tests that concluded after April 18th, 2024.

Your test results will include a confidence rating. This number shows the chance that you would get similar results if you ran your test again under similar conditions. Essentially, it's how likely it is that your results are due to the conditions of the test rather than random chance.

The higher the number, the more likely it is that test conditions (e.g. market selection, withheld media, scaled budgets) led to the results you received. Here are our general guidelines for interpreting confidence ratings:

  • Higher than 80%: high confidence
  • 65% to 80%: medium confidence
  • Lower than 65%: more investigation is required, and the test may need to be run again

Measured uses industry-standard methods to draw these conclusions. For any test, there are initially two different scenarios that are run against each other:

  1. A baseline hypothesis that assumes there will be no positive lift from the test
  2. A positive hypothesis that assumes test conditions will cause an increase in lift

The goal is to prove the baseline hypothesis wrong. For each test, our platform reports the probability that the results we found would still happen under baseline conditions. The lower that probability is, the less likely the results are due to random chance.

The confidence rating you see with your results is calculated as 1 minus that probability. For instance, if a 3% probability was found that your results would still have occurred without the test, the confidence rating would be 1 minus 3%, or 97%.
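The arithmetic above is simple enough to show directly. This sketch reuses the article’s own 3% example and its interpretation bands; the variable names are illustrative.

```python
# Hedged sketch: converting a test's baseline probability (p-value) into
# the confidence rating described above. The 3% figure is the article's example.
p_value = 0.03  # probability the observed results occur under baseline conditions

confidence = 1 - p_value
print(f"Confidence rating: {confidence:.0%}")  # → 97%

# Mapping to the interpretation bands listed earlier:
if confidence > 0.80:
    band = "high confidence"
elif confidence >= 0.65:
    band = "medium confidence"
else:
    band = "more investigation required"
print(band)  # → high confidence
```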

Incremental ROAS and CPO

In the third panel, your test results are summarized in a key high-level metric: incremental ROAS or CPO. You can switch between the two via the toggle in the upper-right of the Business Impact section.

The main metric you see here is based on the incrementality percentage found in the test. Below that, you’ll see the test’s metric compared against the same metric calculated using your previous incrementality percentage.

It is important to take this updated metric within the context of your entire portfolio. Note that the Media Plan Optimizer will take your new test results into account and automatically help you create a better budget plan across your tactics for future spending.

Spending

The Spending section lays out a simple view of how much your test cost to run.

  • Holdout tests will always have a scale cost of zero, since they are based on withholding media instead of testing an increased budget.
  • Scale tests will show the amount that budgets were increased to test the effectiveness of higher spending for your tactic.

How It Works

To see details on how your test results were calculated, the How It Works section gives thorough breakdowns of key factors.

Test Adjustment Factor

The top section walks you through the math of how your test’s incrementality was determined. This adjustment factor is applied to the non-incremental versions of your metrics to produce the true, accurate numbers.

There are multiple steps to this equation:

  1. First, contribution is determined as a percentage: take the difference between the number of orders initially predicted in test markets and the actual orders the test observed (added or subtracted depending on test type), then divide that result by the initial number predicted in test markets.
  2. Then, incremental conversions are found by multiplying the total actual orders across untested markets by the established contribution percentage.
  3. Finally, test incrementality is determined by dividing the incremental conversions by the vendor orders across all markets, whether or not they were part of the test.
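The three steps above can be sketched with made-up order counts. The figures below assume a scale test (actual exceeds predicted in test markets) and are purely illustrative.

```python
# Hedged sketch of the three-step incrementality calculation, with
# hypothetical order counts for a scale test.
predicted_test_orders = 1000   # orders initially predicted in test markets
actual_test_orders = 1150      # orders actually observed in test markets
untested_actual_orders = 9000  # actual orders across untested markets
vendor_orders_all = 1800       # vendor-reported orders across all markets

# 1. Contribution: the lift observed in test markets, as a percentage.
contribution = (actual_test_orders - predicted_test_orders) / predicted_test_orders

# 2. Incremental conversions: apply that percentage to the untested markets.
incremental_conversions = untested_actual_orders * contribution

# 3. Test incrementality: compare against vendor-reported orders everywhere.
test_incrementality = incremental_conversions / vendor_orders_all

print(contribution, incremental_conversions, test_incrementality)
```

With these numbers, contribution is 15%, incremental conversions come to 1,350, and test incrementality is 75%, i.e., below 100%, the over-reporting case described below.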

Since the test markets are precisely matched with similar markets where test conditions weren't implemented, we know the contribution in your results is representative of these untested markets as well.

In most cases, you'll see incrementality below 100%, meaning your integrated platform has over-reported your true number of conversions. If it is above 100%, the platform is instead under-reporting.

Actual vs Predicted Graph

Below the incrementality equation, you’ll see a visual breakdown of the actual versus predicted summary shown in the Business Impact section.

This graph shows how actual conversions lined up with our predictions over the duration of the test. You can interact with the graph by hovering over a point to see the conversions from that time period.

