Splunk Search

stats based on all events and filtered

zineer
New Member

This is probably simpler than I'm thinking on a Friday morning, but with my limited Splunk experience I'm having issues coming up with a solution.

We have events with fields id, date, special_price and total_price. special_price may be null or 0 or have an integer value.

We currently have a search which gets the number of events that have a value for special_price and of those, calculates the percentage that special_price is of total_price:
source=order_log special_price > 0 | timechart span=1d count(id) AS numSpecialOrders, avg(eval((special_price/total_price)*100)) AS percOfTotalPrice

We want to change this so that, rather than just getting the count of special orders (where special_price > 0) per day, we get the percentage of total events which have special_price > 0, and of those, calculate the percOfTotalPrice.

So if there are 5000 events today and 500 have a value > 0 for special_price, and of those 500, the average percOfTotalPrice is 20%, we want: 2014-10-31, 10, 20
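The arithmetic being asked for can be checked outside Splunk. A minimal Python sketch, using the example numbers above (purely illustrative, not SPL):

```python
# Example numbers from the post: 5000 total events, 500 with
# special_price > 0, and an average percOfTotalPrice of 20 among those.
total_events = 5000
special_orders = 500
avg_perc_of_total_price = 20.0

# Percentage of all events that are special orders.
perc_special_orders = (special_orders / total_events) * 100

print(perc_special_orders, avg_perc_of_total_price)  # 10.0 20.0
```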

Thanks!

1 Solution

aholzer
Motivator

If I understand correctly, the only thing you are missing to be able to calculate the percentage of events that are special orders is the total number of events.

I suggest you use appendcols to get that number and then an eval to do the calculation with it afterwards. Like so (where "base search" is your original search from your question):

<base search> | appendcols [search source=order_log | timechart span=1d count as TotalNumOrders] | eval percOfSpecialOrders = (numSpecialOrders/TotalNumOrders) * 100 | table _time, percOfSpecialOrders, percOfTotalPrice

As stated above, the appendcols will calculate the daily total number of orders (not just special orders) and append it to the appropriate day from your original search. So your results should look like:

_time,numSpecialOrders,percOfTotalPrice,TotalNumOrders
2014-10-31, 500, 20, 5000

Then you pipe that into the eval to calculate the percentage that are special orders, getting a new column. Looking like this:

_time,numSpecialOrders,percOfTotalPrice,TotalNumOrders,percOfSpecialOrders 
2014-10-31, 500, 20, 5000, 10

Finally, since you only care about three fields, you call the table command and list out the fields in the proper order, and voila you have what you wanted.

Hope this helps.
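What the appendcols join and the eval do can be sketched in plain Python (an illustrative simulation only; the field names match the SPL above, and note that appendcols actually pairs rows by position rather than by a key, so the per-day lookup here is just for clarity):

```python
# Daily result from the base search (special orders only).
base = {"2014-10-31": {"numSpecialOrders": 500, "percOfTotalPrice": 20.0}}

# Daily totals from the appendcols subsearch (all orders).
totals = {"2014-10-31": {"TotalNumOrders": 5000}}

# appendcols glues the subsearch columns onto the base rows;
# the eval then derives percOfSpecialOrders from the two counts,
# and table keeps only _time plus the two percentages.
rows = []
for day, fields in base.items():
    merged = {**fields, **totals.get(day, {})}
    merged["percOfSpecialOrders"] = (
        merged["numSpecialOrders"] / merged["TotalNumOrders"] * 100
    )
    rows.append((day, merged["percOfSpecialOrders"], merged["percOfTotalPrice"]))

print(rows)  # [('2014-10-31', 10.0, 20.0)]
```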



gkanapathy
Splunk Employee

This works, but I might suggest the use of the eventstats command rather than appendcols with a subsearch. That is probably a little more efficient, i.e., replace the | appendcols ... | ... with | bucket _time span=1d | eventstats count as TotalNumOrders by _time | ...
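The key property of eventstats is that it writes the aggregate back onto every event instead of collapsing them, the way stats or timechart would. A rough Python illustration of `eventstats count as TotalNumOrders by _time` over day-bucketed raw events (toy data, not the poster's):

```python
from collections import Counter

# Raw per-event rows after "bucket _time span=1d": one dict per event.
events = [
    {"_time": "2014-10-31", "special_price": 5},
    {"_time": "2014-10-31", "special_price": 0},
    {"_time": "2014-10-31", "special_price": 8},
]

# eventstats computes the aggregate (count of events per day) and
# attaches it to every event, leaving the individual rows intact.
per_day = Counter(e["_time"] for e in events)
for e in events:
    e["TotalNumOrders"] = per_day[e["_time"]]

print(events[0]["TotalNumOrders"])  # 3
```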


zineer
New Member

gkanapathy, that doesn't seem to work for me. The base search is already returning one row per day, so TotalNumOrders always equals 1. This is what I had with your suggestion: source=order_log special_price > 0 | timechart span=1d count(id) AS numSpecialOrders, avg(eval((special_price/total_price)*100)) AS percOfTotalPrice | bucket _time span=1d | eventstats count as TotalNumOrders by _time | eval percSpecialOrders = (numSpecialOrders/TotalNumOrders) * 100 | table _time, percSpecialOrders, percOfTotalPrice
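The behavior described here follows from the ordering of the commands: by the time eventstats runs, timechart has already collapsed each day to a single row, so counting rows per day can only ever yield 1. A toy Python illustration:

```python
from collections import Counter

# After "timechart span=1d", there is one row per day, not one per event.
timechart_rows = [{"_time": "2014-10-31", "numSpecialOrders": 500}]

# Counting rows per day at this point always gives 1.
per_day = Counter(r["_time"] for r in timechart_rows)
print(per_day["2014-10-31"])  # 1
```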


zineer
New Member

Brilliant!! Thanks so much! Works as expected.
