SPL search pattern efficiency: join vs eventstats

elliotproebstel
Champion

We have a Splunk app that was developed in-house to track indicators that are submitted to a blocklist. Here's a simplified version of the workflow:

  1. Analyst submits indicator to be blocked/unblocked/whitelisted. An event like this is logged: index="blocklist" user="jsmith" indicator="badguy@test.com" source="threatfeed" status="submitted" action="block". (Additional, sensitive fields are present in real events.)
  2. Lead analyst reviews submissions and approves or rejects each indicator. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="approved" action="block"
  3. Lead analyst compiles newly approved indicators and sends them to an operations team for implementation. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="distributed" action="block"

I am trying to revise the queries that populate some of the dashboards that analysts use to interact with the blocklist data, and I'd like some guidance on search patterns. I've been running local tests on the various approaches, but the results aren't as conclusive as I'd like.

Approval/Rejection Dashboard
This page should display all indicators that have been submitted in the last seven days and have not yet been approved or rejected. The engineer who built this app used the following query structure to populate the dashboard:

index="blocklist" status="submitted"
| join type=left indicator action source 
[ search index="blocklist" (status="approved" OR status="rejected")
  | eval has_been_reviewed="true" ]
| search NOT (has_been_reviewed="true")

I've learned to be wary anytime I see join, and I understand that negative searches (i.e. searches using NOT) are less efficient than positive searches. So I was planning to revise the above into this:

index="blocklist" (status="submitted" OR status="approved" OR status="rejected")
| eventstats dc(status) AS status_count values(status) AS status BY action indicator
| search status_count=1 status="submitted"

However, I wanted to first ask - is eventstats more efficient? Or is there an even better pattern I could be using for this search? Thanks!


DalJeanis
SplunkTrust

Yes, this should be much better than the join.

I'd tend to do it this way, which is pretty much equivalent to yours performance-wise...

index="blocklist"  (status="submitted" OR status="approved" OR status="rejected")
| eventstats max(eval(case(status="rejected" OR status="approved","Yes"))) as decisioned 
     BY action indicator
| where status="submitted" AND isnull(decisioned)

updated to use where.
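
If you want to sanity-check the logic without touching the real index, a minimal sketch along these lines may help. It assumes three made-up events generated with makeresults: a submitted and an approved event for badguy@test.com, plus a submitted event for a hypothetical second indicator, other@test.com, that nobody has reviewed. The field n is only a helper for building the fake events.

```spl
| makeresults count=3
| streamstats count AS n
| eval action="block",
       indicator=if(n<=2, "badguy@test.com", "other@test.com"),
       status=case(n=1, "submitted", n=2, "approved", n=3, "submitted")
| eventstats max(eval(case(status="rejected" OR status="approved","Yes"))) AS decisioned
     BY action indicator
| where status="submitted" AND isnull(decisioned)
| table indicator action status decisioned
```

With those assumptions, eventstats should stamp decisioned="Yes" on both badguy@test.com events, so the where clause should leave only the submitted event for other@test.com.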
