Splunk Search

Assigning a max value from one field as a new field

redc
Builder

I am attempting to write a search that creates arbitrary "buckets" for qualifying events using a numeric code (1-5). For this particular search, I'm calculating the number of orders that individuals placed during a specific window of time, then I need to take the maximum number of orders by ANY of the individuals, divide that by 5, and then assign a classification to all of the individuals based on how many orders they placed.

Example:

User Orders Code
   1      2    1
   2      5    3
   3     10    5
   4      4    2

Because of the format that the data is in, finding out how many orders each user has is really complicated and I really don't want to have to run it twice.

To get the largest number of orders in the dataset, I run:

... | stats max(Orders) as MaxOrders

Unfortunately, once I use a stats command, I can't go back and run calculations against the dataset without doing a join or append search that would run ALL the data through the calculations to find out how many orders they have all over again.

I've tried using appendpipe:

... | appendpipe [ stats max(Orders) as MaxOrders ]

This creates a whole new event where everything is null except the "MaxOrders" column. But then I need create buckets (something like this):

... | eval Buckets=(MaxOrders/5) | 
eval Bucket1=Buckets | 
eval Bucket2=(Bucket1+Buckets) | 
eval Bucket3=(Bucket2+Buckets) | 
eval Bucket4=(Bucket3+Buckets) | 
eval Bucket5=(Bucket4+Buckets)

At this point, I have the one extra event at the end of my dataset which contains the values of MaxOrders, Buckets, Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5.

After the buckets are created, I need to compare Orders (for each user) against the buckets to assign the user their code, which requires an eval if statement:

... | eval Code=if(Orders<=Bucket1, 1, if(Orders<=Bucket2, 2, if(Orders<=Bucket3, 3, if(Orders<=Bucket4, 4, 5))))

But since the values for Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5 only exist in that one event (not in all the events), I get 5 for all the users' Code value.

Is there a way to get the values for Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5 to exist in all the records without having to do a join/append to rerun the calculation for Orders all over again?

Tags (1)
0 Karma
1 Solution

kristian_kolb
Ultra Champion

Instead of stats, did you try eventstats? That will let you have the best of both worlds - both the statistics and the original event. Then perhaps that can simplify your further calculations.

... | eventstats max(orders) as MaxOrders | ...

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Eventstats

On a side note, perhaps the case() function for eval is neater than the nested ífs.

/K

View solution in original post

kristian_kolb
Ultra Champion

Instead of stats, did you try eventstats? That will let you have the best of both worlds - both the statistics and the original event. Then perhaps that can simplify your further calculations.

... | eventstats max(orders) as MaxOrders | ...

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Eventstats

On a side note, perhaps the case() function for eval is neater than the nested ífs.

/K

redc
Builder

Oh, wow...I've never used eventstats outside of a Splunk class and I had kind of forgotten about it.

That worked PERFECTLY! Thanks!

You may be right about the nested ifs vs. case() function...another one I haven't used.

0 Karma
Get Updates on the Splunk Community!

.conf24 | Day 0

Hello Splunk Community! My name is Chris, and I'm based in Canberra, Australia's capital, and I travelled for ...

Enhance Security Visibility with Splunk Enterprise Security 7.1 through Threat ...

 (view in My Videos)Struggling with alert fatigue, lack of context, and prioritization around security ...

Troubleshooting the OpenTelemetry Collector

  In this tech talk, you’ll learn how to troubleshoot the OpenTelemetry collector - from checking the ...