SPL search pattern efficiency: join vs eventstats

elliotproebstel
Champion

We have a Splunk app that was developed in-house to track indicators that are submitted to a blocklist. Here's a simplified version of the workflow:

  1. Analyst submits indicator to be blocked/unblocked/whitelisted. An event like this is logged: index="blocklist" user="jsmith" indicator="badguy@test.com" source="threatfeed" status="submitted" action="block". (Additional, sensitive fields are present in real events.)
  2. Lead analyst reviews submissions and approves or rejects each indicator. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="approved" action="block"
  3. Lead analyst compiles newly approved indicators and sends them to an operations team for implementation. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="distributed" action="block"

I am trying to revise the queries that populate some of the dashboards that analysts use to interact with the blocklist data, and I'd like some guidance on search patterns. I've been running local tests on the various approaches, but the results aren't as conclusive as I'd like.

Approval/Rejection Dashboard
This page should display all indicators that have been submitted in the last seven days and have not yet been reviewed (neither approved nor rejected). The engineer who built this app used the following query structure to populate the dashboard:

index="blocklist" status="submitted"
| join type=left indicator action source 
[ search index="blocklist" (status="approved" OR status="rejected")
  | eval has_been_reviewed="true" ]
| search NOT (has_been_reviewed="true")

I've learned to be wary anytime I see join, and I understand that negative searches (i.e. searches using NOT) are less efficient than positive searches. So I was planning to revise the above into this:

index="blocklist" (status="submitted" OR status="approved" OR status="rejected")
| eventstats dc(status) AS status_count values(status) AS status BY action indicator
| search status_count=1 status="submitted"

However, I wanted to first ask - is eventstats more efficient? Or is there an even better pattern I could be using for this search? Thanks!
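
For local testing, one option is to run the filtering logic against a handful of synthetic events built with makeresults, so the expected result is known in advance. The sketch below is only an illustration: the indicators pending@test.com and spam@test.com and the helper field n are made up for the test, and the last two stages are copied from the revised query above.

```spl
| makeresults count=4
| streamstats count AS n
| eval indicator=case(n<=2, "badguy@test.com", n=3, "pending@test.com", n=4, "spam@test.com")
| eval status=if(n=2, "approved", "submitted")
| eval action="block", source="threatfeed"
| fields - n
| eventstats dc(status) AS status_count values(status) AS status BY action indicator
| search status_count=1 status="submitted"
```

In this data only pending@test.com and spam@test.com lack a review event, so those are the two rows a correct query should return. Swapping the final two stages for another candidate makes it easy to confirm that the approaches agree before comparing their run times in the Job Inspector.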


DalJeanis
Legend

Yes, this should be much better than the join.

I'd tend to do it this way, which is pretty much equivalent to yours performance-wise...

index="blocklist" (status="submitted" OR status="approved" OR status="rejected")
| eventstats max(eval(case(status="rejected" OR status="approved","Yes"))) AS decisioned BY action indicator
| where status="submitted" AND isnull(decisioned)

updated to use where.
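
If the dashboard only needs one row per indicator rather than the raw events, a plain stats may be worth testing as well. This is a sketch under that assumption, not something benchmarked here; the output names submitted_by and last_submitted are just illustrative.

```spl
index="blocklist" (status="submitted" OR status="approved" OR status="rejected")
| stats dc(status) AS status_count values(status) AS status values(source) AS source latest(user) AS submitted_by latest(_time) AS last_submitted BY action indicator
| where status_count=1 AND status="submitted"
```

Because stats is a transforming command, the indexers can pre-aggregate before sending results to the search head, whereas eventstats needs the full event set in one place. The trade-off is that any field not named in the stats clause is dropped, so this only fits if the dashboard can live with the summarized rows.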
