Splunk Search

How to improve search performance with a join alternative? Query frequently times out due to the subsearch time limit

JackNY07
Explorer

I have a query that frequently times out due to the subsearch time limit. I'd like to improve its performance but I'm not sure how. Here's my query:

host=prod* source=user-activity.log sourcetype=log4j ID=uniqueID MESSAGE="LOGIN_SUCCESS*"
| stats count as Logins by Full_Date, ID, DName, STATE
| join type=left ID
    [ search host=prod* source=server.log sourcetype=log4j MESSAGE="[Dashboard User-Facing*" ID=uniqueID
    | stats count as Errors by Full_Date, ID, DName, STATE ]
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins, %

 

Any help would be greatly appreciated.

1 Solution

bowesmana
SplunkTrust

It's quite straightforward to get this into a single search: search data_set_1 OR data_set_2, then work out which data set each event belongs to inside the stats command, like this

 

host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0))) as Logins sum(eval(if(source="server.log", 1, 0))) as Errors by Full_Date, ID, DName, STATE 
| eval %=round((100*Errors)/Logins,0) 
| table ID, DName, Full_Date, STATE, Errors, Logins,%

 

In the stats/eval, like(MESSAGE, "LOGIN_SUCCESS%") flags a login event (mirroring the LOGIN_SUCCESS* wildcard in your base search) and source="server.log" flags an event from the error data set. Note that server.log has to be quoted inside eval, otherwise it is treated as a field name rather than a string.
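
One thing to watch, assuming your data can have a Full_Date/ID combination with errors but no logins: Logins will be 0 for those rows, and dividing by zero in eval just returns null, so the % column will be blank. If you'd rather flag those rows explicitly, something like this (untested) would do it:

| eval %=if(Logins>0, round((100*Errors)/Logins,0), "N/A")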

 


JackNY07
Explorer

@bowesmana Thank you!

Quick question though: how would I dedup just one of the two searches by SESSID? stats dc? I had previously used eventstats dc(SESSID) as Errors but can't seem to work it into the new search.


bowesmana
SplunkTrust

Depends on what that dedup is supposed to be doing. Is SESSID a unique session ID that is found in errors and are you trying to count unique sessions with errors?

Do both data sets have a SESSID field?

If you want to get unique sessions for the error data set, then add something like this to the existing stats command

dc(eval(if(source="server.log", SESSID, null()))) as SessionsWithErrors

which is saying: if the event is from the errors data set (source=server.log), return its SESSID, otherwise return null. dc() then counts the distinct session IDs.
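
Slotting that into the earlier combined search would look something like this (untested sketch; SessionsWithErrors is just an illustrative name, and this assumes the error events carry the same Full_Date/DName/STATE fields as the login events):

host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(like(MESSAGE, "LOGIN_SUCCESS%"), 1, 0))) as Logins dc(eval(if(source="server.log", SESSID, null()))) as SessionsWithErrors by Full_Date, ID, DName, STATE
| eval %=round((100*SessionsWithErrors)/Logins,0)
| table ID, DName, Full_Date, STATE, SessionsWithErrors, Logins, %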

 
