Solved: Assigning a max value from one field as a new fiel...

redc · ‎12-18-2013

I am attempting to write a search that creates arbitrary "buckets" for qualifying events using a numeric code (1-5). For this particular search, I'm calculating the number of orders that individuals placed during a specific window of time, then I need to take the maximum number of orders by ANY of the individuals, divide that by 5, and then assign a classification to all of the individuals based on how many orders they placed.

Example:

User Orders Code
   1      2    1
   2      5    3
   3     10    5
   4      4    2

Because of the format that the data is in, finding out how many orders each user has is really complicated and I really don't want to have to run it twice.

To get the largest number of orders in the dataset, I run:

... | stats max(Orders) as MaxOrders

Unfortunately, once I use a stats command, I can't go back and run calculations against the dataset without doing a join or append search that would run ALL the data through the calculations to find out how many orders they have all over again.

I've tried using appendpipe:

... | appendpipe [ stats max(Orders) as MaxOrders ]

This creates a whole new event where everything is null except the "MaxOrders" column. But then I need create buckets (something like this):

... | eval Buckets=(MaxOrders/5) | 
eval Bucket1=Buckets | 
eval Bucket2=(Bucket1+Buckets) | 
eval Bucket3=(Bucket2+Buckets) | 
eval Bucket4=(Bucket3+Buckets) | 
eval Bucket5=(Bucket4+Buckets)

At this point, I have the one extra event at the end of my dataset which contains the values of MaxOrders, Buckets, Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5.

After the buckets are created, I need to compare Orders (for each user) against the buckets to assign the user their code, which requires an eval if statement:

... | eval Code=if(Orders<=Bucket1, 1, if(Orders<=Bucket2, 2, if(Orders<=Bucket3, 3, if(Orders<=Bucket4, 4, 5))))

But since the values for Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5 only exist in that one event (not in all the events), I get 5 for all the users' Code value.

Is there a way to get the values for Bucket1, Bucket2, Bucket3, Bucket4, and Bucket5 to exist in all the records without having to do a join/append to rerun the calculation for Orders all over again?

kristian_kolb · ‎12-18-2013

Instead of stats, did you try eventstats? That will let you have the best of both worlds - both the statistics and the original event. Then perhaps that can simplify your further calculations.

... | eventstats max(orders) as MaxOrders | ...

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Eventstats

On a side note, perhaps the case() function for eval is neater than the nested ífs.

/K

View solution in original post

kristian_kolb · ‎12-18-2013

Instead of stats, did you try eventstats? That will let you have the best of both worlds - both the statistics and the original event. Then perhaps that can simplify your further calculations.

... | eventstats max(orders) as MaxOrders | ...

http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Eventstats

On a side note, perhaps the case() function for eval is neater than the nested ífs.

/K

redc · ‎12-18-2013

Oh, wow...I've never used eventstats outside of a Splunk class and I had kind of forgotten about it.

That worked PERFECTLY! Thanks!

You may be right about the nested ifs vs. case() function...another one I haven't used.

Assigning a max value from one field as a new field

Splunk Observability for AI

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Splunk Observability as Code: From Zero to Dashboard

Are you a member of the Splunk Community?

Assigning a max value from one field as a new field

Splunk Observability for AI

Splunk Enterprise Security 8.x: The Essential Upgrade for Threat Detection, ...

Splunk Observability as Code: From Zero to Dashboard