Adobe Analytics and time between funnel steps

Recently I had to deal with the question “How much time does it take for a customer to go through funnel steps?” A simple question on the face of it. But the more I thought about it, the more complicated it became. So what is the best way to measure and report, using Adobe Analytics, the time required for your visitors to go through the different funnel steps?

The main requirement that I had to meet was something along the lines of the image below:

[Image: Requirements]

I am going to demonstrate a few alternative approaches to answering this question. Depending on your resources, setup and requirements, you have a few options to consider:

  1. Adobe Analytics out-of-the-box report
  2. Adobe Analytics plugin
  3. Custom data collection solution
  4. Adobe Data Feeds and SQL

1) Adobe Analytics out-of-the-box report

The easiest way to answer such a question would be to use the Adobe Analytics default report “Time prior to event”. It calculates the amount of time required for a success event to occur within a visit. However, the report is not very practical because the timing values are returned as dimension values (grouped in buckets). This is not useful if you want a trended analysis, i.e. to monitor the over-time performance of customers performing a certain action. If that were possible, this would be the best combination of ease of implementation versus utility.

[Image: Time Prior to Event report sample]

2) Adobe Analytics plugin

There is a plugin to evaluate the time difference between two success events: “getTimeToComplete”. On the surface it looks like a nice solution. However, it is hard to ignore some key limitations:

  1. It calculates the difference between two events only. You have to code separate logic for each pair of events, i.e. one block for “Success Event 12” – “Success Event 10”, and a different one for “Success Event 40” – “Success Event 31”.
  2. It suggests storing the value in a dimension, which is good for classification but not so good for trended analysis.
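
For reference, the plugin tends to be wired up roughly as in the sketch below. Treat the exact signature as an assumption, since it can differ between plugin versions; “eVar7” and the “ttc” cookie name are placeholders, not values from this article:

```javascript
// Hedged sketch of typical getTimeToComplete usage; verify against your plugin version.
// On the step that starts the clock:
s.eVar7 = s.getTimeToComplete('start', 'ttc', 0); // 'ttc' = cookie name, 0 = session expiry

// On the step that completes the flow:
s.eVar7 = s.getTimeToComplete('stop', 'ttc', 0);  // returns the elapsed time as a string
```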

There is always the option to modify and customise the plugin, but if you are willing to go down that path, please read below!

3) Custom data collection solution

This is where things can get complicated! The more I thought about the process, the more caveats I came up with that needed to be addressed. The initial success events and prop to be configured would be:

  • One success event will capture the time difference in seconds from the “Start” of the journey (a numeric event)
  • One success event will capture the total number of instances a measurement was recorded (a counter event)
  • One prop will capture the current step, i.e. “step 1”, “step 2”, “end”. This will be utilised later on for hit-based segmentation to isolate the measurements for each individual step

Now let’s discuss the considerations:

  • Each measurement needs to be captured only once per logical time frame (or session). This is required to avoid counting multiple times the cases where users go through certain steps of the process more than once.
  • If the user comes back to the site after that logical time frame has passed, we would like to capture the measurements again.
  • We need to define explicitly all the steps of the process in order to adjust the data collection accordingly. That has the inherent limitations of no historical data and low flexibility.

In order to deal with that, I used the concept of a journey. The journey has an explicit starting point and fixed steps. We need to know those in order to calculate the various time differences. A journey:

  1. Will start once and has a certain duration.
  2. During that time period, it cannot restart.
  3. During the same time period, each measurement will be captured only once.
  4. Every time the user goes through the steps, the journey is checked for expiration. If it has expired, it is eligible to be restarted and all subsequent steps can be measured again.

As you can see, there are already a lot of constraints and limitations. But let’s see how this looks visually. The image below highlights the basic steps that a user would have to go through. Once the “Registration Start” step is triggered, the journey is initiated and the expiration is set 30 minutes (or any time frame that works for you) in the future. Then each step is recorded only once during that journey.

[Image: Journey]

In order to achieve that, some custom code was required. To be honest, the measurements were surprisingly good, but as you can see from the code, the implementation is far from straightforward. All the logic, considerations and constraints had to be implemented. Have a look:
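
The original gist is not reproduced here, so below is a minimal sketch of the journey logic described above; it is a reconstruction under stated assumptions, not the original code. The event numbers (event10 for seconds since start, event11 for instances), prop10 for the step name, the step name “start” and the use of localStorage are all hypothetical; adapt them to your own implementation:

```javascript
// Minimal sketch of the journey logic (not the original gist).
// Assumptions: event10 = seconds since journey start (numeric event),
// event11 = instance counter, prop10 = step name, 30-minute journey duration.
var JOURNEY_KEY = 'aaJourney';
var JOURNEY_DURATION_MS = 30 * 60 * 1000;

function trackJourneyStep(s, stepName) {
  var now = Date.now();
  var journey = null;
  try {
    journey = JSON.parse(localStorage.getItem(JOURNEY_KEY));
  } catch (e) { /* corrupted state: treat as no journey */ }

  // A journey can only (re)start on the "start" step, and only if no
  // unexpired journey exists (rules 1, 2 and 4 above).
  if (!journey || now > journey.expires) {
    if (stepName !== 'start') { return; } // no active journey: nothing to measure
    journey = { started: now, expires: now + JOURNEY_DURATION_MS, steps: {} };
  }

  // Each step is measured only once per journey (rule 3); "start" records 0 seconds.
  if (journey.steps[stepName]) { return; }
  journey.steps[stepName] = true;
  localStorage.setItem(JOURNEY_KEY, JSON.stringify(journey));

  var secondsSinceStart = Math.round((now - journey.started) / 1000);
  s.linkTrackVars = 'prop10,events';
  s.linkTrackEvents = 'event10,event11';
  s.prop10 = stepName;                        // current step, for hit-based segmentation
  s.events = 'event10=' + secondsSinceStart + // numeric event: seconds since start
             ',event11';                      // counter event: one instance recorded
  s.tl(true, 'o', 'journey: ' + stepName);    // send as a custom link call
}
```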

But how did the results look? Is this something that can be used? Actually it can, with some limitations! Since all the measurements are captured using 2 success events and a prop, we need some segmentation to get the numbers. In particular, to calculate the time required to get from the “start” of the journey to each step, we proceed as follows. First we calculate the average time since the start of registration, i.e. a calculated metric dividing the total seconds captured by the numeric event by the number of instances recorded.

[Image: Calculated metric: calculating the average time since the journey start]

After that, we apply hit-level segmentation using the prop mentioned previously. This isolates the measurements for the time required to reach each step.

[Image: Reporting configuration of metrics]

The above reporting configuration can lead to a visualisation like the one below.

[Image: Trended view of “Time Since Start” for different steps]

Unfortunately, the above calculations are based on the mean value. That means they are very sensitive to extremely high measurements, and there are a lot of those. In summary:

  1. No historical data
  2. Measurements heavily depend on data collection
  3. No flexibility for custom funnels
  4. High sensitivity to high values

The initial approach could provide a good basis for monitoring the time differences between certain actions of interest. However, all the work above seems like a bit too much. Let’s explore an alternative!

4) Adobe Analytics Data Feeds and SQL

Having mentioned all the methods above, they feel limiting compared to my preferred option: working with Adobe Data Feeds using SQL! I have been arguing in favour of working with Data Feeds for some time now. They are very flexible and will open up a whole new world to explore. To find out more about loading data and working in BigQuery, please see my previous articles on the topic.

Let’s start with the ideal result first, what we really need! The table below provides indicative measurements where we compare the mean and median time required to reach each registration step from the first step. Additionally, we can see that the median values are much lower, highlighting a previously mentioned limitation: average values are highly sensitive to extreme measurements. PS: The numbers are not real, I fiddled with them; nevertheless the principle still applies.

Funnel Step             Mean (sec)   Median (sec)
registration: step 2    852          494
registration: step 3    1346         935
registration: step 4    1549         1080
end                     2101         1380

But how do we get there? We will break down some key steps. The GitHub Gist below is organised into 4 separate SQL queries, each one meant to be executed on the results of the previous one.

Let’s analyse them one by one.

  • 1_filterRows.sql: This depends on your implementation, but the key idea is that we want to isolate the rows where each individual step is recorded. You might need to filter based on a different prop, eVar or event. However, you always need the visitor ID, visit number, step name and timestamp.
  • 2_calculateStepStart.sql: Group by visitor ID and visit number in order to obtain the earliest time that each step of the journey took place.
  • 3_calculateTimeDifferences.sql: Calculate the time difference of each individual step since the earliest measurement per visit & visitor ID.
  • 4_calculateMeasurements.sql: Calculate the median and mean values per funnel step.
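
The gist itself is not embedded here, so the sketch below chains the four steps as CTEs in a single BigQuery query. It is a minimal sketch under assumptions: the hit_data table name and the standard Data Feed columns (post_visid_high, post_visid_low, visit_num, hit_time_gmt) should be verified against your own schema, and post_prop10 as the step name is a hypothetical stand-in for your own prop, eVar or event filter:

```sql
-- Minimal sketch of the four steps as chained CTEs (BigQuery syntax).
WITH filtered_rows AS (          -- 1_filterRows: isolate rows where a step is recorded
  SELECT
    CONCAT(post_visid_high, post_visid_low) AS visitor_id,
    visit_num,
    post_prop10 AS step_name,    -- hypothetical prop holding the step name
    hit_time_gmt                 -- Unix timestamp in seconds
  FROM hit_data
  WHERE post_prop10 IS NOT NULL AND post_prop10 != ''
),
step_start AS (                  -- 2_calculateStepStart: earliest time per step
  SELECT visitor_id, visit_num, step_name,
         MIN(hit_time_gmt) AS step_time
  FROM filtered_rows
  GROUP BY visitor_id, visit_num, step_name
),
time_differences AS (            -- 3_calculateTimeDifferences: seconds since journey start
  SELECT visitor_id, visit_num, step_name,
         step_time - MIN(step_time) OVER (PARTITION BY visitor_id, visit_num)
           AS seconds_since_start
  FROM step_start
)
SELECT                           -- 4_calculateMeasurements: mean and median per step
  step_name,
  AVG(seconds_since_start) AS mean_seconds,
  APPROX_QUANTILES(seconds_since_start, 2)[OFFSET(1)] AS median_seconds
FROM time_differences
WHERE seconds_since_start > 0    -- exclude the starting step (time difference 0)
GROUP BY step_name
ORDER BY mean_seconds;
```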

I hope you have found this overview helpful! Given that you have a process in place to load Adobe Analytics Data Feeds into a database, you can generate the same results much more easily.

As I have reiterated many times, working with the raw clickstream data is the way forward for analysts! Have fun!