Adobe Analytics VISTA Rules data issues

VISTA rules stand for “Visitor Identification, Segmentation & Transformation Architecture”. They are helpful for a number of reasons within Adobe Analytics and it is a very old feature of Adobe Analytics (This post is from 2008! -> https://blog.adobe.com/en/publish/2008/11/30/vista-inside-omniture-sitecatalyst). The linked post should elaborate quite well their capabilities. Due to the age and flexibility of the features, debugging Adobe Analytics VISTA Rules data issues can be difficult and time-consuming exercise.

Traditionally Adobe Engineering is configuring them based on clients’ requirements. Also this is a paid feature. As a feature, customers cannot view the functionalities and configurations are and you have to rely on Adobe to provide you a document outlining them. However overtime and as staff rotates roles, it is hard to maintain historical knowledge/documentation – even worse – forget that they even exist. So make sure to check with Client Care when picking up a new account.

Today I will add one more pain in the challenges working with VISTA rules. If you have to deal with cases where you have random orders or checkout starts appearing in your data in random pages, then this might be relevant to you. Hopefully you will not have to deal with this issue but in case it helps you, it will worth spending 5 minutes to understand the issue.

How Adobe Analytics VISTA rules work

VISTA rules are part of the overall processing pipeline that takes place within Analytics. First data is getting collected. Then the raw data is processed by Processing Rules (as configured within the Admin console). Next VISTA rules take effect and modify the data. Finally Marketing Channels are processed. Lastly the data ends up in the final data processing step where is stored, aggregated and available for reporting (rough picture, probably oversimplified!).

Analytics data processing pipeline
Analytics Processing Order

In simple terms they modify the data and their results are visible in the final reported figures. If you are familiar with raw data feeds, this is like saying:

  • non-post columns in data feeds are VISTA rules
  • post columns in data feeds are AFTER VISTA rules

Technically the above is correct for Processing Rules and Marketing Channel Processing Rules.

For example, if you have a VISTA rule that sets evar10 = ‘shop’ while it is empty during data collection and not modified by processing rules, you will observe the following behaviour in the data feeds:

  • evar10 column will be empty
  • post_evar10 column will be set to ‘shop’
  • event_list column will NOT contain 109
  • post_event_list column WILL contain 109

When examining the data feeds contents, there is an event look up file which helps decode the value 109. 109 stands for Instances of eVar10.

event lists references - evar1-10
Event lists references – eVars 1-10

How this can create data issues

To generalise, for every eVar that you set (either in data collection or through processing/VISTA rules), an equivalent number will be set in event_list (if set through data collection) and post_event_list (regardless of method).

VISTA rules have by default a limit in the number of characters they can process when they modify the event_list. The magic number is 260 characters. If the event_list is modified by a VISTA rule and there are more characters, then the values get be truncated resulting in unexpected results!

The situation becomes more complicated when you enter the area of eVar100-eVar250. Suddenly the numbers added to event_list become 5-digit.

event lists references - evar101-110
Event lists references – eVars 101-110

So if you have – let’s say – 45 eVars in the range from eVar101 to eVar250 populated, then the total size of the event_list column can easily exceed the threshold of 260 chars. 45 eVars might sound a lot, but some times is not!

As a result of the above, the numeric values above will truncate creating funny results. From my experience, most commonly the below metrics were reported while not being true:

  • 1 -> Purchase Event
  • 10 -> Cart Open
  • 11 -> Checkout

Unfortunately that means that the above metrics can be reported in unexpected places. Given the importance of them, this can easily be very frustrating and lead to loss of trust in the data and analysis.

Identifying potential issues

The quick way to spot this would be if you were experiencing data collected in unexpected places. This can be Orders or Checkout Start reported in irrelevant pages i.e. listing or product pages.

Alternatively if you have access to the data feeds raw data, it can be easier to diagnose this issue by writing a small SQL code. The below will just return a few rows where orders appear post-processing but are not part of the data collection.

SELECT event_list, post_event_list
FROM [table name]
where concat(',',event_list,',') not like '%,1,%'
and concat(',',post_event_list,',') like '%,1,%'
limit 5

Please keep in mind that if you populate orders through Processing Rules or VISTA rules, those orders will still be returned by the query (because they appear in the final processing but are not present in the data collection). So it’s important to be 100% certain that you are tracking Orders (i.e. purchase event) as part of the data collection.

If you have spotted concerning data, you can quantify their extend by running the below. It will get you by date, the volume of total events recorded and the volume of orders reported but were not actually part of the data collection process.

SELECT 
date_time,
count(event_list) as total_events,
sum(
case
when (
concat(',', event_list, ',') not like '%,1,%'
and concat(',', post_event_list, ',') not like '%,1,%'
) then 0
when (
concat(',', event_list, ',') not like '%,1,%'
and concat(',', post_event_list, ',') like '%,1,%'
) then 1
end
) as fake_order
FROM [table name]
where date_time > date '2021-09-01'
and concat(',', event_list, ',') not like '%,1,%'
group by 1
order by date_time asc

Finally, if you want confirm in your setup that the 260-character rule applies, you can execute the below. It will output the volume of “fake orders” broken down by the length of the event list. So if you observe high volume of them in the bracket of 250-270, you should be concerned. At this moment I am not sure why the values fluctuate but the premise is still the same.

with cte as (
SELECT event_list,
post_event_list,
length(event_list) as event_list_count,
case
when (
concat(',', event_list, ',') not like '%,1,%'
and concat(',', post_event_list, ',') not like '%,1,%'
) then 'clean_hit'
when (
concat(',', event_list, ',') not like '%,1,%'
and concat(',', post_event_list, ',') like '%,1,%'
) then 'fake_order'
end as hit_outcome
FROM [table name]
where date_time > date '2021-09-01'
and concat(',', event_list, ',') not like '%,1,%'
)
select event_list_count,
hit_outcome,
count(event_list) as event_list_list_volumes
from cte
where hit_outcome = 'fake_order'
group by 1, 2
order by event_list_count asc

Resolving the issues

This is where it becomes difficult to resolve such Adobe Analytics VISTA Rules data issues. Generally I would suggest to raise this with your account manager. I have spent countless of hours discussing this with Adobe Client Care and then subsequently with Adobe Consulting. Technically this is a limitation of the feature so they should be able to support on it. Hopefully you will get in touch with a helpful person within Adobe to assist you in resolving this free of charge.

Eventually in my case they were able to raise the characters limit that VISTA rules had in processing the event_list and the issue just went away. No more incorrect data!

Hopefully this will help at least one of you out there and save you from all the energy and time required to investigate this and explain to Adobe the situtation.