Solved: How to find the top ten categories causing bandwid...

Ida_2017 · ‎11-11-2019

Hi I am using this search in order to find out what Bluecoat filter categories cause the most bandwidth utilization

index=bluecoat mysearch | fields sc_filter_category sc_bytes  | eventstats sum(sc_bytes) as allbytes  |  stats sum(sc_bytes) as "totalbytes"  by sc_filter_category,allbytes | eval "Bandwidth(MB)"= round(totalbytes/(1024*1024),2) |  eval Percentage=(totalbytes/allbytes)*100  | sort 10 -"Bandwidth(MB)"

Tis seems to work fine.

My result is as an example a table like this

sc_filter_category  allbytes, Bandwidth(MB), Percentage
category1, 100, 20,20
category2,100, 11,11
category3,100,10,10
category4,100,5,5

So what I would like to do is then in a second search be able to list what top two URLs cause the most bandwidth for each category.

The output would look like this

Category    Top-URLs
category1   www.abc.com, www.def.com, www.ghi.com
category2  www.abc1.com, www.def1.com, www.ghi1.com

I am not able to find out how to search dynamically using the result of the first search... any help appreciated.

FrankVl · ‎11-11-2019

In general, searching based on the result of another search can be done using subsearches: https://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsubsearches

But that might not be the most efficient solution here. Assuming the url is in the url field (adjust accordingly if it isn't), something like this might work:

index=bluecoat mysearch
| fields sc_filter_category sc_bytes url
| stats sum(sc_bytes) as "urlbytes" by url,sc_filter_category
| eventstats sum(urlbytes) as totalbytes by sc_filter_category
| sort - totalbytes,urlbytes
| stats values(totalbytes) as totalbytes list(url) as urls by sc_filter_category
| eval urls = mvjoin(mvindex(urls,0,2),", ")
| eventstats sum(totalbytes) as allbytes
| eval "Bandwidth(MB)"= round(totalbytes/(1024*1024),2)
| eval Percentage=(totalbytes/allbytes)*100
| sort 10 -"Bandwidth(MB)"

This calculates the bytes per url and category, then sorts on those 2 fields, so you get the categories sorted by totalbytes and within that the urls sorted by urlbytes. It then uses stats to get the urls into a multivalue field (which retains the sorting done just before) and then takes only the first 3 urls per category and puts them behind eachother using mvjoin(mvindex(urls,0,2),", "). Then allbytes and other calculations are done like you had them before.

Note: this also assumes URLs are 1:1 related to categories.

View solution in original post

Ida_2017 · ‎11-11-2019

Hi Frank

that definitely works for me. I was working according to the examples in the link for subsearches, but as I am not looking for the events with the most counts, but for the calculated bandwidth, I never got it to work.

Thanks for your quick and efficient answer!

FrankVl · ‎11-11-2019

In general, searching based on the result of another search can be done using subsearches: https://docs.splunk.com/Documentation/Splunk/latest/Search/Aboutsubsearches

But that might not be the most efficient solution here. Assuming the url is in the url field (adjust accordingly if it isn't), something like this might work:

index=bluecoat mysearch
| fields sc_filter_category sc_bytes url
| stats sum(sc_bytes) as "urlbytes" by url,sc_filter_category
| eventstats sum(urlbytes) as totalbytes by sc_filter_category
| sort - totalbytes,urlbytes
| stats values(totalbytes) as totalbytes list(url) as urls by sc_filter_category
| eval urls = mvjoin(mvindex(urls,0,2),", ")
| eventstats sum(totalbytes) as allbytes
| eval "Bandwidth(MB)"= round(totalbytes/(1024*1024),2)
| eval Percentage=(totalbytes/allbytes)*100
| sort 10 -"Bandwidth(MB)"

This calculates the bytes per url and category, then sorts on those 2 fields, so you get the categories sorted by totalbytes and within that the urls sorted by urlbytes. It then uses stats to get the urls into a multivalue field (which retains the sorting done just before) and then takes only the first 3 urls per category and puts them behind eachother using mvjoin(mvindex(urls,0,2),", "). Then allbytes and other calculations are done like you had them before.

Note: this also assumes URLs are 1:1 related to categories.

How to find the top ten categories causing bandwidth and then using this search to find the top three URLs for these categories?

Announcing Scheduled Export GA for Dashboard Studio

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!