Splunk Search

SPL search pattern efficiency: join vs eventstats

elliotproebstel
Champion

We have a Splunk app that was developed in-house to track indicators that are submitted to a blocklist. Here's a simplified version of the workflow:

  1. Analyst submits indicator to be blocked/unblocked/whitelisted. An event like this is logged: index="blocklist" user="jsmith" indicator="badguy@test.com" source="threatfeed" status="submitted" action="block". (Additional, sensitive fields are present in real events.)
  2. Lead analyst reviews submissions and approves or rejects each indicator. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="approved" action="block"
  3. Lead analyst compiles newly approved indicators and sends them to an operations team for implementation. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="distributed" action="block"

I am trying to revise the queries that populate some of the dashboards that analysts use to interact with the blocklist data, and I'd like some guidance on search patterns. I've been running local tests on the various approaches, but the results aren't as conclusive as I'd like.

Approval/Rejection Dashboard
This page should display all indicators that have been submitted in the last seven days and have not yet been approved. The engineer who built this app used the following query structure to populate the dashboard:

index="blocklist" status="submitted"
| join type=left indicator action source 
[ search index="blocklist" (status="approved" OR status="rejected")
  | eval has_been_reviewed="true" ]
| search NOT (has_been_reviewed="true")

I've learned to be wary anytime I see join, and I understand that negative searches (i.e. searches using NOT) are less efficient than positive searches. So I was planning to revise the above into this:

index="blocklist" (status="submitted" OR status="approved" OR status="rejected")
| eventstats dc(status) AS status_count values(status) AS status BY action indicator
| search status_count=1 status="submitted"

However, I wanted to first ask - is eventstats more efficient? Or is there an even better pattern I could be using for this search? Thanks!
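For anyone who wants to see what the eventstats line in the revised query computes before the final filter, here is a minimal sketch that can be pasted into a search bar without touching the real index. It fabricates three events with makeresults: the badguy@test.com values come from the sample events above, while pending@test.com and the counter field n are hypothetical and exist only for illustration.

```spl
| makeresults count=3
| streamstats count AS n
| eval indicator=if(n<=2, "badguy@test.com", "pending@test.com"),
       status=case(n=1, "submitted", n=2, "approved", n=3, "submitted"),
       action="block"
| eventstats dc(status) AS status_count values(status) AS status BY action indicator
| table indicator action status_count
```

Both badguy@test.com rows should show status_count=2, because that indicator has a submitted and an approved event, and the pending@test.com row should show status_count=1. That count is what the final search line filters on.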


DalJeanis
SplunkTrust

Yes, this should be much better than the join.

I'd tend to do it this way, which is pretty much equivalent to yours performance-wise...

index="blocklist"  (status="submitted" OR status="approved" OR status="rejected")
| eventstats max(eval(case(status="rejected" OR status="approved","Yes"))) as decisioned 
     BY action indicator
| where status="submitted" AND isnull(decisioned)

Updated to use where.
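A quick way to sanity-check this pattern without querying the real index is to run it over a few fabricated events. The sketch below assumes nothing about the production data: it builds three events with makeresults, reusing badguy@test.com from the question and adding a hypothetical pending@test.com that has only been submitted. The counter field n is also just a helper for building the sample rows.

```spl
| makeresults count=3
| streamstats count AS n
| eval indicator=if(n<=2, "badguy@test.com", "pending@test.com"),
       status=case(n=1, "submitted", n=2, "approved", n=3, "submitted"),
       action="block", source="threatfeed"
| eventstats max(eval(case(status="rejected" OR status="approved","Yes"))) AS decisioned
     BY action indicator
| where status="submitted" AND isnull(decisioned)
| table indicator action source status
```

Only the pending@test.com row should survive. The approved event for badguy@test.com sets decisioned to "Yes" on every event in that action/indicator group, so its submitted event is filtered out by the where clause.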
