I have a query that frequently times out due to the subsearch time limit. I'd like to improve it's performance but I'm not sure how. Here's my query:
host=prod* source=user-activity.log sourcetype=log4j ID=uniqueID MESSAGE="LOGIN_SUCCESS*"| stats count as Logins by Full_Date, ID, DName, STATE | join type=left ID [ search host=prod* source=server.log sourcetype=log4j MESSAGE="[Dashboard User-Facing*" ID=uniqueID | stats count as Errors by Full_Date ,ID, DName, STATE ]|eval %=round((100*Errors)/Logins,0) |table ID, DName, Full_Date, STATE, Errors, Logins,%
Any help would be greatly appreciated.
It's quite straightforward to get this into a single search, by searching data_set_1 OR data_set_2 then evaluating the data set in the stats command like this
host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(MESSAGE="LOGIN_SUCCESS", 1, 0))) as Logins sum(eval(if(source=server.log, 1, 0))) as Errors by Full_Date, ID, DName, STATE
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins,%
In the stats/eval you are checking for the message login success to indicate a login data event and then the source for the error data set.
It's quite straightforward to get this into a single search, by searching data_set_1 OR data_set_2 then evaluating the data set in the stats command like this
host=prod* sourcetype=log4j (source=user-activity.log ID=uniqueID MESSAGE="LOGIN_SUCCESS*") OR (source=server.log MESSAGE="[Dashboard User-Facing*" ID=uniqueID)
| stats sum(eval(if(MESSAGE="LOGIN_SUCCESS", 1, 0))) as Logins sum(eval(if(source=server.log, 1, 0))) as Errors by Full_Date, ID, DName, STATE
| eval %=round((100*Errors)/Logins,0)
| table ID, DName, Full_Date, STATE, Errors, Logins,%
In the stats/eval you are checking for the message login success to indicate a login data event and then the source for the error data set.
@bowesmana Thank you!
Quick question though. How would I dedup just 1 of the 2 searches by SESSID? stats dc? I had previously used eventstats dc(SESSID) as Errors but can't seem to work it into the new search.
Depends on what that dedup is supposed to be doing. Is SESSID a unique session ID that is found in errors and are you trying to count unique sessions with errors?
Do both data sets have a SESSID field?
If you want to get unique sessions for the error data set, the something like this in the existing stats command
dc(eval(if(source="server.log", SESSID, null()))) as SessionsWithErrors
which is saying if it is the errors data set (source=server.log) then return the SESSID from that event otherwise return null. dc will then count unique sessions.