Splunk Search

How can I improve search performance with a join alternative? My query frequently times out due to the subsearch time limit

JackNY07
Explorer

I have a query that frequently times out due to the subsearch time limit. I'd like to improve its performance but I'm not sure how. Here's my query:

host=prod* source=user-activity.log sourcetype=log4j ID=uniqueID MESSAGE="LOGIN_SUCCESS*"
| stats count as Logins by Full_Date, ID, DName, STATE
| join type=left ID
    [ search host=prod* source=server.log sourcetype=log4j MESSAGE="[Dashboard User-Facing*" ID=uniqueID
    | stats count as Errors by Full_Date, ID, DName, STATE ]
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins, %

 

Any help would be greatly appreciated.


bowesmana
SplunkTrust

It's quite straightforward to get this into a single search: search data_set_1 OR data_set_2, then evaluate which data set each event belongs to inside the stats command, like this

 

host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0))) as Logins sum(eval(if(source="server.log", 1, 0))) as Errors by Full_Date, ID, DName, STATE
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins, %

 

In the stats/eval you check the message for login success to identify a login event, and you check the source to identify an event from the error data set. Note that string literals in eval must be quoted (source="server.log"), unlike in the base search.

 



JackNY07
Explorer

@bowesmana Thank you!

Quick question though: how would I dedup just one of the two searches by SESSID? stats dc? I had previously used eventstats dc(SESSID) as Errors but can't seem to work it into the new search.


bowesmana
SplunkTrust

Depends on what that dedup is supposed to be doing. Is SESSID a unique session ID that is found in errors and are you trying to count unique sessions with errors?

Do both data sets have a SESSID field?

If you want to count unique sessions in the error data set, then add something like this to the existing stats command

dc(eval(if(source="server.log", SESSID, null()))) as SessionsWithErrors

This says: if the event belongs to the error data set (source="server.log"), return its SESSID, otherwise return null(). dc then counts the distinct session IDs, ignoring the nulls from the login events.
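For completeness, here is a sketch of how that dc clause fits into the combined search, assuming SESSID only exists (or only matters) on the error events and that the distinct-session count is what you want reported as Errors, matching your earlier eventstats dc(SESSID) as Errors:

host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0))) as Logins dc(eval(if(source="server.log", SESSID, null()))) as Errors by Full_Date, ID, DName, STATE
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins, %

If you want both the raw error-event count and the distinct-session count, keep the original sum(eval(...)) as Errors and give the dc(eval(...)) a different name such as SessionsWithErrors.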

 
