Splunk Search

showperc and usother for stats/charts?

jankowsr
Path Finder

I'm wondering if there is any option to have a showperc and useother functionality in stats/charts? They seem to be available only in top command which (if I'm correct) can only aggregate by count of events.
So if I want to print simple statistics for top N talkers showing for example:

Source_IP transferred_bytes perc_of_transferred_bytes number_of_tcp_flows perc_of_tcp_flows
1:
2:
3:
.....
10:
Other:

then it becomes quite complex query.

The solution for calculating percents has been provided here: https://answers.splunk.com/answers/9660/stats-count-as-a-percentage-as-the-total.html

The question for calculating "Other" statistics has been raised here:
https://answers.splunk.com/answers/68744/stats-limiting-to-top-results-and-summarizing-the-rest.html
but not answered so far. If found a way of calculating "others" in another query and merging it with the usage of the appendpipe command.

Anyway as a result of all such combinations I have a huge, complex and hard to maintain query. showperc and useother options for stats/charts would make life so much easier. Any ideas?

0 Karma
1 Solution

lguinn2
Legend

I shortened the field names to make it easier to type, but I think this solution should work.

yoursearchhere
| stats sum(bytes) as xBytes, count as numFlows by src_ip
| eventstats sum(xBytes) as totalBytes, sum(numFlows) as totalFlows
| sort 10 -xBytes -numFlows
| appendpipe [ stats sum(xBytes) as topBytes, last(totalBytes) as totalBytes,
                     sum(numFlows) as topFlows, last(totalFlows) as totalFlows 
    | eval src_ip = "Other" 
    | eval xBytes = totalBytes - topBytes 
    | eval numFlows = totalFlows - topFlows ]
| where xBytes > 0
| eval percentBytes = round(xBytes * 100 / totalBytes, 1)
| eval percentFlows = round(numFlows * 100 / totalFlows, 1)
| table src_ip xBytes percentBytes numFlows percentFlows

This doesn't require subsearches. To change from a "top 10" to a "top 20," simply change the sort command. I did make it a little fancy, so that it drops the "Other" line if there are no "others" - for example, if you asked for a top 10, but there were only 9 source ips in the timerange.

View solution in original post

lguinn2
Legend

I shortened the field names to make it easier to type, but I think this solution should work.

yoursearchhere
| stats sum(bytes) as xBytes, count as numFlows by src_ip
| eventstats sum(xBytes) as totalBytes, sum(numFlows) as totalFlows
| sort 10 -xBytes -numFlows
| appendpipe [ stats sum(xBytes) as topBytes, last(totalBytes) as totalBytes,
                     sum(numFlows) as topFlows, last(totalFlows) as totalFlows 
    | eval src_ip = "Other" 
    | eval xBytes = totalBytes - topBytes 
    | eval numFlows = totalFlows - topFlows ]
| where xBytes > 0
| eval percentBytes = round(xBytes * 100 / totalBytes, 1)
| eval percentFlows = round(numFlows * 100 / totalFlows, 1)
| table src_ip xBytes percentBytes numFlows percentFlows

This doesn't require subsearches. To change from a "top 10" to a "top 20," simply change the sort command. I did make it a little fancy, so that it drops the "Other" line if there are no "others" - for example, if you asked for a top 10, but there were only 9 source ips in the timerange.

jankowsr
Path Finder

Thank you Iguinn, I hoped there will be something simpler that comes with stats/chart, so adding or changing column will not be necessary in 6 places. Anyway I still find it quite smart query

0 Karma

lguinn2
Legend

BTW, there is a "limit=N" option for the chart command, but there is no "useother" option.

A big problem with calculating the "top values" for a stats/chart is that there could be many values/sums/counts/fields in the results - which ones do you use and how do you calculate "top"? The top command computes a single counter, so it doesn't have this problem. I think this is why such an option doesn't exist today.

In your example, the bytes and the number of flows require two different initial calculations. This can probably never be simply resolved by an option.

0 Karma

jankowsr
Path Finder

Thank you for your response, Iguinn.
I'm fully aware of the fact that decision has to be made which column is used for sorting and selecting top values. Anyway since such decision is made having such stats with multiple aggregated columns out of the box and with the "others" row present would be to me a reasonable feature.
I don't want to propose syntax for it but I could imagine it wouldn't be something impossible.
I have just provided my tiny piece of feedback, obviously decision for the future is yours ;-).

0 Karma

lguinn2
Legend

Oh no, believe me, it's not my decision at all! I have much less impact on feature decisions than customers. And that is really how things should be.

Although I have been thinking about whether I could write a custom search command to do this, and how it might work... 🙂

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.

Can’t make it to .conf25? Join us online!

Get Updates on the Splunk Community!

Can’t Make It to Boston? Stream .conf25 and Learn with Haya Husain

Boston may be buzzing this September with Splunk University and .conf25, but you don’t have to pack a bag to ...

Splunk Lantern’s Guide to The Most Popular .conf25 Sessions

Splunk Lantern is a Splunk customer success center that provides advice from Splunk experts on valuable data ...

Unlock What’s Next: The Splunk Cloud Platform at .conf25

In just a few days, Boston will be buzzing as the Splunk team and thousands of community members come together ...