Splunk Search

Separating data into buckets based on lists containing a specific value (and sum # of items in arrays)

ericwindmill
Observer

Howdy,


Basically, what I'm trying to achieve is putting all events into 2 buckets, based on the `tracking policies`, and then count the number of clicks for each bucket (so the sum of events from each event within the buckets). There is example event data below.

Disclaimer: This data isn't real, it just follows the same structure that I'm using IRL. So there may be some typos/incorrect naming/wonkiness.

 

event 1
name: Guy
address: 1234 Street Ave
age: 32
additional_info (json): "{
    "tracking_policies": ["no track", "light tracking",]
}"
tracking (json): "{
    user_interactions: "{
        "clicks": ["click1", "click2"]
    }
}"

 

So, I want to find the answer to two questions:
1. For users that have consented to tracking (users that _don't_ have "no track" policy present in the list), what is the average number of "clicks" they're making?
2. For users that have "no track" policy in their additional_info.tracking_policies list, I want to see how many clicks they've made. (Really, what I'm doing is confirming that this number is zero.)

So far, this is my query:

 

base search
| spath input=user_interactions output=clicks path=tracking{}.user_interactions{}.clicks{}
| spath input=additional_info output=tracking_policies path=tracking_policies{}
| eval has_no_tracking=if("tracking_policies"="no track", 1, 0)
| eval has_tracking=if(ad_strategy!="no track", 1, 0) 
| stats count AS event_count, 
        count(has_no_tracking) as has_no_tracking_count
        count(has_tracking) as has_tracking_count
        count(clicks) AS total_clicks
| eval avg_clicks=total_clicks/has_no_tracking_count
| table event_count avg_clicks

 

 

These are the problems I'm having so far:
1. I have no idea if the booleans are "correct". What I'm trying to do is separate events into to buckets by "list contains no track" and everything else
2. avg_clicks isn't correct because it assumes that all clicks are from tracked accounts. total_clicks needs to be compared to something like `clicks_from_tracked_account`, which I'm struggling to get.
3. Similarly, I'm struggling to get the metric like `clicks_from_untracked_accounts` (which should be 0, if our application is working correctly)

The final table I'm looking for would look like this:

 

table event_count total_clicks clicks_from_tracked_account avg_clicks_from_tracked_account clicks_from_untracked_accounts

 

I know that this is not a specific question, and it may be too broad to be asked. If that's the case, just helping me achieve "clicks_from_tracked_account" would be a massive help.

Labels (3)
0 Karma
Get Updates on the Splunk Community!

The All New Performance Insights for Splunk

Splunk gives you amazing tools to analyze system data and make business-critical decisions, react to issues, ...

Good Sourcetype Naming

When it comes to getting data in, one of the earliest decisions made is what to use as a sourcetype. Often, ...

See your relevant APM services, dashboards, and alerts in one place with the updated ...

As a Splunk Observability user, you have a lot of data you have to manage, prioritize, and troubleshoot on a ...