Splunk Search

How do I count each value of a multi-value field and show the top 30 with percentage?

raykongstar
Explorer

Dear Community,

So far, I have gone through the posted QnAs, but haven't yet found a way to make it work with my data context and desired output.

I'm analyzing ASA firewall logs. I'm trying to make a search for the top 30 inbound ports for a range of IPs.

FIELDS:
Destination IP = dest_ip 
Destination Port = dest_port
Protocol = protocol

The closest I could get was as below;

dest_ip="10.66.100.*" | stats value(dest_port) count by dest_ip

The above gave me a list of IPs and ports but the total count was seemingly for all ports combined.

DESIRED OUTPUT:
IP # List of Ports # Hits count for each port # Protocol # Percentage of total traffic

Search results for ports belonging to the same IP address should be grouped together. So IP address followed but multi-value field of ports with corresponding count, protocol and percentage for each.


We can also try this on the buttercupgames data so that we all get same output format;

Fields for buttercupgames:
dest_port = clientip
dest_ip= productId

Let's use just these two fields above, forget protocol & port.

 Buttercupgames | stats count by productId,clientip | eventstats sum(count) as total by productId
 | eval percentage=round((count/total)*100,2)
     | stats list(*) as * by productId | sort -total | head 30 | fields - total

I hope you already have buttercupgames data or know how to get it. I'm not allowed to post links yet. It's Splunk's own sample data.

Regards!

0 Karma
1 Solution

renjith_nair
Legend

@raykongstar,

Try

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip

Updated:

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip
---
What goes around comes around. If it helps, hit it with Karma 🙂

View solution in original post

renjith_nair
Legend

@raykongstar,

Try

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip

Updated:

  dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
  |eval percentage=round((count/_total)*100,2)
  |stats list(*) as * by dest_ip
---
What goes around comes around. If it helps, hit it with Karma 🙂

renjith_nair
Legend

@raykongstar, I am using data from internal index to simulate your scenario.

Here I consider index in place of your dest_ip and sourcetype in place of your port.
Please have a look at the attached image and let me know where you see the issue

In step1 - It calculates the count based on the index,sourcetype(dest_ip,dest_port) and then total is calculated per index(dest_ip). Percentage is calculated as (count/total)*100
In step2 - It brings everything into a list for the formatting but data remains the same (ignore slight variation of count as its live data)
In step3 - We sort it based on total and take the first value which is index=_internal in this cases since it has the highest total.
alt text

---
What goes around comes around. If it helps, hit it with Karma 🙂

raykongstar
Explorer

Thanks this worked perfectly. I'll play around with the Top/head command ad it's still not giving me the top results I need.. Thanks Again, This is usable for me.

0 Karma

raykongstar
Explorer

The updated version of your search gave me exactly what I needed except the formula for percentage is still not right.

But. So far so good, I need to use this information to help determine what ports to open on the firewall.

Only the following are missing now;

  1. Percentage not calculated right.
  2. BONUS: Top 30/20/10 values.
0 Karma

renjith_nair
Legend

@raykongstar,
1. The percentage calculated as what percentage count per port is of Total count for that IP ((count/total)*100). Let me know what's your formula for percentage and we can change.
2. try

   dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as total by dest_ip
   |eval percentage=round((count/total)*100,2)
   |stats list(*) as * by dest_ip|sort - total |head 30|fields - total
---
What goes around comes around. If it helps, hit it with Karma 🙂
0 Karma

raykongstar
Explorer

Thanks Renjith,

Unfortunately that didn't solve the issue(s).

  1. Percentage = (count/Summation of total counts PER dest_ip)*100,2) . It can be confusing, see my suggestion below about buttercupgames data.

  2. I still didn't get top 30 and the results weren't sorted.

I had an idea, why dont we all try this in the context of BUTTERGAMES data so that we be on the same page on the output;

For instance I tried below on buttercupgames but still didn't give expected results;

Fields for buttercupgames:
dest_port = clientip
dest_ip= productId

Let's use just these two fields above, forget protocol & port.

Buttercupgames | stats count by productId,clientip |eventstats sum(count) as total by productId
|eval percentage=round((count/total)*100,2)
    |stats list(*) as * by productId|sort -total |head 30|fields - total

I hope you already have buttercupgames data or know how to get it. I'm not allowed to post links yet. It's Splunk's own sample data.

0 Karma

raykongstar
Explorer

@renjith.nair You search was very close but unfortunately had the following issues or maybe I wasn't clear enough.

  1. The percentage is supposed to calculate only for a certain dest_ip, so should total to 100% for each IP address.
  2. The ports belonging to one IP address should be grouped together, not appear as different search results. ie. See below;

10.66.100.1 ..... 6543 ... 345 .... 40%
.... 443 ..... 453 ..... 50%
..... 80 ....... 70 ...... 10%

10.66.100.4 ..... 6543 ... 345 .... 30%
.... 443 ..... 453 ..... 50%
..... 80 ....... 70 ...... 20%

10.66.100.8 ..... 6543 ... 345 .... 20%
.... 443 ..... 453 ..... 20%
..... 80 ....... 70 ...... 10%
.... 443 ..... 453 ..... 30%
..... 80 ....... 70 ...... 20%

Etc etc etc.. I hope that's clearer now.

Thanks in advance.

0 Karma

raykongstar
Explorer

I see spaces have been removed.. just imagine there next line after IP address is indented.

0 Karma

renjith_nair
Legend

@raykongstar,
Try this

 dest_ip="10.66.100.*" | stats count by dest_port,port,protocol,dest_ip|eventstats sum(count) as _total by dest_ip
 |eval percentage=round((count/_total)*100,2)
 |stats list(*) as * by dest_ip
---
What goes around comes around. If it helps, hit it with Karma 🙂
0 Karma
Get Updates on the Splunk Community!

New in Observability - Improvements to Custom Metrics SLOs, Log Observer Connect & ...

The latest enhancements to the Splunk observability portfolio deliver improved SLO management accuracy, better ...

Improve Data Pipelines Using Splunk Data Management

  Register Now   This Tech Talk will explore the pipeline management offerings Edge Processor and Ingest ...

3-2-1 Go! How Fast Can You Debug Microservices with Observability Cloud?

Register Join this Tech Talk to learn how unique features like Service Centric Views, Tag Spotlight, and ...