Splunk Search

How can I improve search performance with an alternative to join? Query frequently times out due to the subsearch time limit

JackNY07
Explorer

I have a query that frequently times out due to the subsearch time limit. I'd like to improve its performance, but I'm not sure how. Here's my query:

host=prod* source=user-activity.log sourcetype=log4j ID=uniqueID MESSAGE="LOGIN_SUCCESS*"
| stats count as Logins by Full_Date, ID, DName, STATE
| join type=left ID
    [ search host=prod* source=server.log sourcetype=log4j MESSAGE="[Dashboard User-Facing*" ID=uniqueID
    | stats count as Errors by Full_Date, ID, DName, STATE ]
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins, %

 

Any help would be greatly appreciated.

1 Solution

bowesmana
SplunkTrust
SplunkTrust

It's quite straightforward to turn this into a single search: search data_set_1 OR data_set_2, then work out which data set each event belongs to inside the stats command, like this

 

host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0))) as Logins sum(eval(if(source="server.log", 1, 0))) as Errors by Full_Date, ID, DName, STATE 
| eval %=round((100*Errors)/Logins,0) 
| table ID, DName, Full_Date, STATE, Errors, Logins, %

 

In the stats/eval you check the message (LOGIN_SUCCESS) to identify the login events and the source (server.log) to identify the error events.
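As a sketch of an equivalent formulation, the conditionals can also be pulled out into explicit eval steps before stats, which some find easier to read and debug. The is_login/is_error field names here are illustrative, and like() with % matches the same events the LOGIN_SUCCESS* wildcard selects at search time:

```
host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| eval is_login=if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0)
| eval is_error=if(source="server.log", 1, 0)
| stats sum(is_login) as Logins sum(is_error) as Errors by Full_Date, ID, DName, STATE
```

Because the intermediate flag fields survive until stats, you can temporarily table them to verify each event was classified as expected before aggregating.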

 



JackNY07
Explorer

@bowesmana Thank you!

Quick question though. How would I dedup just one of the two searches by SESSID? stats dc? I had previously used eventstats dc(SESSID) as Errors, but I can't seem to work it into the new search.


bowesmana
SplunkTrust
SplunkTrust

It depends on what that dedup is supposed to do. Is SESSID a unique session ID found in the error events, and are you trying to count unique sessions with errors?

Do both data sets have a SESSID field?

If you want to get unique sessions for the error data set, then something like this in the existing stats command

dc(eval(if(source="server.log", SESSID, null()))) as SessionsWithErrors

which says: if the event is from the errors data set (source=server.log), return the SESSID from that event, otherwise return null. dc then counts the distinct session IDs.
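Putting that together with the earlier combined search, the full query might look like the sketch below. This assumes SESSID only needs to be distinct-counted on the error events, and keeps the field name Errors to match the earlier eventstats dc(SESSID) as Errors usage:

```
host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0))) as Logins dc(eval(if(source="server.log", SESSID, null()))) as Errors by Full_Date, ID, DName, STATE
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins, %
```

Note that dc over the null()-guarded eval ignores login events entirely, so repeated errors within one session count only once while Logins still counts every login event.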

 
