Splunk Search

How to filter data in a search using the AND condition?

athorat
Communicator

We are using this search to find out the Hadoop jobs which have finished.
Each finished job has a start event and an end event.
Start Event : ":start:] with user-retry state"
END Event : "@end***]Action updated in DB!"

The problem with the below search is we are using an OR condition. How do we filter data only when both conditions match?

index=aap_prod sourcetype="HDP:PROD:OOZIE"  (":start:] with user-retry state" OR "@end***]Action updated in DB!") | rex "TOKEN\[\] APP\[(?[^\]]*)"|   rex "ACTION\[[^\@]*(?[^\d\]]*)" | rex "JOB\[?(?[\d-]+)-"|rename JobId as "Job ID",JobName as "Job Name"  | sort _time | streamstats current=t  window=2 range(_time) as JobRunTime latest(_time) as EndTime earliest(_time) as StartTime |sort -_time | table "Job Name", "Job ID",,StartTime , EndTime  ,JobRunTime|dedup "Job ID"| eval StartTime=strftime(StartTime, "%c")|eval EndTime=strftime(EndTime, "%c")|eval JobRunTime=tostring(JobRunTime, "duration")  |search JobRunTime!=0
0 Karma
1 Solution

Richfez
SplunkTrust
SplunkTrust

It sounds like start and end events are in different events, so you could replace a lot of what you are doing with a transaction. One requirement of this is that you need an identifier to "join them up on", like a JobID. I get hints in your search that this is the case, so I'll use JobID for that for now. I'm also assuming that there's a maximum of 15 minutes between a job start and a job end. Adjust as required.

I'm going to write this a couple of lines at a time so I can explain, because there's some other fixups and changes I made. PLUS, you can run this and add one piece at a time to see the affect it has, in fact, I very heartily recommend it.

index=aap_prod sourcetype="HDP:PROD:OOZIE" 
| transaction JobID startswith=":start:] with user-retry state"  endswith="@end***]Action updated in DB!" maxspan=15m
| search duration!=0
| put your rex for TOKEN, ACTION and JOB here | then your rename of Job ID and Job Name

First is your base search which pulls out ALL the events involved with this. We can tweak if necessary, but there's no need to be too restrictive.

Then comes the transaction. Read up on it - its very useful. It will "connect together" into a single event all the individual events that should go together. You for free get a field called duration field to replace all your JobRunTime and stuff. I don't think you'll need to escape any of the characters in there, but we may find differently. We will see.

Third is a search to find only transactions with a duration greater than zero (replacing your end-of-search JobRunTime!=0).

Last is the rexes you need for those fields and the two renames. (A side note - these events appear to have some absolute crazy formatting and puncuatation in them given the startswith and endswith stuff, so I won't comment on the possible oddity of the rexes - just paste 'em in if they work and go on with life - they'll still work if they did before!).

Now, continuing along, here's some things you for sure do NOT need:

| sort _time 
| streamstats current=t  window=2 range(_time) as JobRunTime latest(_time) as EndTime earliest(_time) as StartTime 
| sort -_time 

Those aren't needed because they were used before to calculate the job run time, which you already have as "duration."

Now, the start and ending time... we need to keep a bit of this, but I'm going to change it some. You may not need this at all, in which case I would suggest to not add it.

| stats earliest(_time) as StartTime, latest(_time) as EndTime
| eval StartTime=strftime(StartTime, "%c")|eval EndTime=strftime(EndTime, "%c")
| eval JobRunTime=tostring(duration, "duration")  

I used stats to get your earliest and latest times for each event, then the rest is probably mostly OK if you need to do that.

So all together:

 index=aap_prod sourcetype="HDP:PROD:OOZIE" 
 | transaction JobID startswith=":start:] with user-retry state"  endswith="@end***]Action updated in DB!" maxspan=15m
 | search duration!=0
 | put your rex for TOKEN, ACTION and JOB here | then your rename of Job ID and Job Name
 | stats earliest(_time) as StartTime, latest(_time) as EndTime
 | eval StartTime=strftime(StartTime, "%c")|eval EndTime=strftime(EndTime, "%c")
 | eval JobRunTime=tostring(duration, "duration")

Now, just as a note there was a bit to fix up in there, so hopefully it all works, but if not please take a little time and build it one piece at a time, seeing what works as I describe it and what doesn't, then ask about the parts that don't work. We can work through this, it just might take a little bit of back and forth.

ALSO, it might be useful if this doesn't work perfectly (or with trivial changes) to paste in an event or two - sometimes seeing the raw data can help immensely. But hopefully it'll pretty much work.

View solution in original post

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...