Solved: Re: Nested searches

seanlon11 · ‎05-12-2010

Is there a way to pass parameters from one search to another search?

Scenario: Our WebSphere servers will sometimes receive a java.lang.OutOfMemoryError, and probably 90% of the time it will heal itself. The way to tell if it healed itself or not is to wait a few minutes after the initial OutOfMemoryError and then search to see if the server re-started.

1st query: host="was0" java.lang.OutOfMemoryError

2nd query: host="specificHostFromFirstQuery" "Start Display Current Environment"

In the 2nd query, I'd like to know the specific host that failed from the 1st query so I can sleep for a few minutes and then run the 2nd query.

So far, I have started to do this by having the 1st query kick off a script on my Unix server. The script sleeps for a few minutes and then runs the 2nd query for all servers, and I grep through the results to see if I need to send an email.

Any thoughts?

Thanks, Sean

sideview · ‎05-12-2010

I don't think you actually need a subsearch here. There are a lot of things that only subsearches can do, but I don't think your case is one of them (and your case is a pretty cool example). It's good practice to avoid subsearches wherever possible.

Here's one way to do it:

java.lang.OutOfMemoryError OR "Start Display Current Environment" | transaction host maxspan=5m startswith="OutOfMemoryError" | search java.lang.OutOfMemoryError NOT "Start Display Current Environment"

Dissecting this step by step, the first search command gets you a big interleaved result set containing both kinds of events.
Then the transaction command rolls everything up so that each row in the results comes from a single host, and the 'event' text is now a big multiline string of all the events in that transaction, always starting with the OutOfMemoryError and never spanning more than 5 minutes total.
Then at the end we simply filter out the transactions where the host did actually restart.
NOTE: I actually have to include the java.lang.OutOfMemoryError term again in the final search. The reason is that if a transaction terminates because it reaches the maxspan, it'll actually start a new transaction right then and there, which will have random events in it but no java.lang.OutOfMemoryError. Filtering again by OutOfMemoryError just filters that noise away.

Hopefully that'll work.
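
If the goal is an alert, one possible extension (a sketch, not tested against your data) is to trim the output down to one row per unrecovered host. The duration and eventcount fields are added by the transaction command itself. The where clause is my own assumption about how to skip errors that are less than five minutes old and so might still be followed by a restart:

```
java.lang.OutOfMemoryError OR "Start Display Current Environment"
| transaction host maxspan=5m startswith="OutOfMemoryError"
| search java.lang.OutOfMemoryError NOT "Start Display Current Environment"
| where _time < relative_time(now(), "-5m")
| table _time host duration eventcount
```

Saved as a scheduled search over a time window somewhat longer than five minutes, something like this could probably stand in for the sleep-and-grep script.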

As an aside, this approach opens up several other avenues: notably if you put this before the transaction,

eval errorTime = if(searchmatch("OutOfMemoryError"), _time, null()) | eval restartTime = if(searchmatch("Start Display Current Environment"), _time, null())

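and then after the transaction you put eval timeBeforeRestart = restartTime-errorTime, then you can graph the timeBeforeRestart by host, which could be a useful trick. Note that for this you'd keep the transactions that do contain the restart message instead of filtering them out, since those are the only ones with a restartTime.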
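
Put together, the aside might look like the sketch below. It assumes each transaction contains exactly one error event and one restart event (with repeats, errorTime or restartTime become multivalued and the subtraction would likely need extra handling). The final search keeps only the transactions that did restart, and the chart command at the end is just one way of graphing the result:

```
java.lang.OutOfMemoryError OR "Start Display Current Environment"
| eval errorTime = if(searchmatch("OutOfMemoryError"), _time, null())
| eval restartTime = if(searchmatch("Start Display Current Environment"), _time, null())
| transaction host maxspan=5m startswith="OutOfMemoryError"
| search java.lang.OutOfMemoryError "Start Display Current Environment"
| eval timeBeforeRestart = restartTime - errorTime
| chart avg(timeBeforeRestart) by host
```

Here timeBeforeRestart is in seconds, since _time is an epoch timestamp.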

rayfoo · ‎05-12-2010

Do provide more info and samples of the log events that you want to look out for. Are there only two cases that can occur (only OutOfMemoryError, or OutOfMemoryError followed by "Start Display Current Environment")? Would the OutOfMemoryError repeat itself indefinitely before the restart? etc...

seanlon11 · ‎05-13-2010

Thanks for the help. I will test and let you know how it goes.

sideview · ‎05-13-2010

Ha, that's a good point. I initially wasn't using maxspan and was instead searching for restartDuration<300, but the maxspan was simpler. I'll go back and delete them. Thanks.

rayfoo · ‎05-12-2010

Why eval errorTime and restartTime if they are never used subsequently? Just curious 🙂
