Splunk Search

How do I count each value of a multi-value field and show the top 30 with percentage?

raykongstar
Explorer

Dear Community,

So far, I have gone through the posted QnAs, but haven't yet found a way to make it work with my data context and desired output.

I'm analyzing ASA firewall logs. I'm trying to make a search for the top 30 inbound ports for a range of IPs.

FIELDS:
Destination IP = dest_ip 
Destination Port = dest_port
Protocol = protocol

The closest I could get was as below;

dest_ip="10.66.100.*" | stats value(dest_port) count by dest_ip

The above gave me a list of IPs and ports but the total count was seemingly for all ports combined.

DESIRED OUTPUT:
IP # List of Ports # Hits count for each port # Protocol # Percentage of total traffic

Search results for ports belonging to the same IP address should be grouped together. So IP address followed but multi-value field of ports with corresponding count, protocol and percentage for each.


We can also try this on the buttercupgames data so that we all get same output format;

Fields for buttercupgames:
dest_port = clientip
dest_ip= productId

Let's use just these two fields above, forget protocol & port.

 Buttercupgames | stats count by productId,clientip | eventstats sum(count) as total by productId
 | eval percentage=round((count/total)*100,2)
     | stats list(*) as * by productId | sort -total | head 30 | fields - total

I hope you already have buttercupgames data or know how to get it. I'm not allowed to post links yet. It's Splunk's own sample data.

Regards!

0 Karma
1 Solution

renjith_nair
Legend

@raykongstar,

Try

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip

Updated:

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip
---
What goes around comes around. If it helps, hit it with Karma 🙂

View solution in original post

renjith_nair
Legend

@raykongstar,

Try

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip

Updated:

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip
---
What goes around comes around. If it helps, hit it with Karma 🙂

renjith_nair
Legend

@raykongstar, I am using data from internal index to simulate your scenario.

Here I consider index in place of your dest_ip and sourcetype in place of your port.
Please have a look at the attached image and let me know where you see the issue

In step1 - It calculates the count based on the index,sourcetype(dest_ip,dest_port) and then total is calculated per index(dest_ip). Percentage is calculated as (count/total)*100
In step2 - It brings everything into a list for the formatting but data remains the same (ignore slight variation of count as its live data)
In step3 - We sort it based on total and take the first value which is index=_internal in this cases since it has the highest total.
alt text

---
What goes around comes around. If it helps, hit it with Karma 🙂

raykongstar
Explorer

Thanks this worked perfectly. I'll play around with the Top/head command ad it's still not giving me the top results I need.. Thanks Again, This is usable for me.

0 Karma

raykongstar
Explorer

The updated version of your search gave me exactly what I needed except the formula for percentage is still not right.

But. So far so good, I need to use this information to help determine what ports to open on the firewall.

Only the following are missing now;

  1. Percentage not calculated right.
  2. BONUS: Top 30/20/10 values.
0 Karma

renjith_nair
Legend

@raykongstar,
1. The percentage calculated as what percentage count per port is of Total count for that IP ((count/total)*100). Let me know what's your formula for percentage and we can change.
2. try

   dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as total by dest_ip
   |eval percentage=round((count/total)*100,2)
   |stats list(*) as * by dest_ip|sort - total |head 30|fields - total
---
What goes around comes around. If it helps, hit it with Karma 🙂
0 Karma

raykongstar
Explorer

Thanks Renjith,

Unfortunately that didn't solve the issue(s).

  1. Percentage = (count/Summation of total counts PER dest_ip)*100,2) . It can be confusing, see my suggestion below about buttercupgames data.

  2. I still didn't get top 30 and the results weren't sorted.

I had an idea, why dont we all try this in the context of BUTTERGAMES data so that we be on the same page on the output;

For instance I tried below on buttercupgames but still didn't give expected results;

Fields for buttercupgames:
dest_port = clientip
dest_ip= productId

Let's use just these two fields above, forget protocol & port.

Buttercupgames | stats count by productId,clientip |eventstats sum(count) as total by productId
|eval percentage=round((count/total)*100,2)
    |stats list(*) as * by productId|sort -total |head 30|fields - total

I hope you already have buttercupgames data or know how to get it. I'm not allowed to post links yet. It's Splunk's own sample data.

0 Karma

raykongstar
Explorer

@renjith.nair You search was very close but unfortunately had the following issues or maybe I wasn't clear enough.

  1. The percentage is supposed to calculate only for a certain dest_ip, so should total to 100% for each IP address.
  2. The ports belonging to one IP address should be grouped together, not appear as different search results. ie. See below;

10.66.100.1 ..... 6543 ... 345 .... 40%
.... 443 ..... 453 ..... 50%
..... 80 ....... 70 ...... 10%

10.66.100.4 ..... 6543 ... 345 .... 30%
.... 443 ..... 453 ..... 50%
..... 80 ....... 70 ...... 20%

10.66.100.8 ..... 6543 ... 345 .... 20%
.... 443 ..... 453 ..... 20%
..... 80 ....... 70 ...... 10%
.... 443 ..... 453 ..... 30%
..... 80 ....... 70 ...... 20%

Etc etc etc.. I hope that's clearer now.

Thanks in advance.

0 Karma

raykongstar
Explorer

I see spaces have been removed.. just imagine there next line after IP address is indented.

0 Karma

renjith_nair
Legend

@raykongstar,
Try this

 dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
 |eval percentage=round((count/_total)*100,2)
 |stats list(*) as * by dest_ip
---
What goes around comes around. If it helps, hit it with Karma 🙂
0 Karma
Get Updates on the Splunk Community!

Get Schooled with Splunk Education: Explore Our Latest Courses

At Splunk Education, we’re dedicated to providing incredible learning experiences that cater to every skill ...

Splunk AI Assistant for SPL | Key Use Cases to Unlock the Power of SPL

Splunk AI Assistant for SPL | Key Use Cases to Unlock the Power of SPL  The Splunk AI Assistant for SPL ...

Buttercup Games: Further Dashboarding Techniques (Part 5)

This series of blogs assumes you have already completed the Splunk Enterprise Search Tutorial as it uses the ...