Splunk Search

How to combine my two searches get the duration of completed jobs with start/end events and display a list of incomplete jobs?

dpoloche
Explorer

I have two searches that I am trying to combine into one and keep running into roadblocks. Preferably, I would be able to accomplish the task without using transactions (second search). My goals are:

  1. See Jobs that are created, but never completed in the last 10 days
  2. Get the duration of current jobs in hours

My first search gives me the jobs completed/created, but it leaves out jobs created prior to the time window and is missing job duration.

   .....search.... EventDescription="Job Created" OR EventDescription="Job Completed"  
    | stats values(EventDescription) AS Event, list(_time) as Time by Job_ID 
    | eval Time=strftime(Time, "%m-%d-%y %H:%M:%S")

alt text

My second search gives me duration in hours, but uses the expensive transaction command and doesn't tell me about jobs that got created but never completed during the window.

 ....search .... EventDescription="Job Created" OR EventDescription="Job Completed" 
| transaction Job_ID startswith="Job Created" endswith="Job Completed" | where duration>0  
| eval StartTime=strftime(_time, "%m-%d-%y %H:%M:%S"), EndTime=strftime(_time+duration, "%m-%d-%y %H:%M:%S")   
| eval JobDuration = floor(duration/60/60)." Hours " | table Job_ID StartTime EndTime JobDuration

alt text

1 Solution

wpreston
Motivator

Personally, I feel that transaction deserves some love. I know it is an expensive command, but it is also a powerful one with some very useful features. If you decide to keep using transaction, you can simply add the keepevicted=t flag to the transaction command to get the information you want.

It will add a closed_txn field which denotes transactions that were:

  • Opened and closed (both the startswith and endswith conditions were met). In this case, closed_txn=1
  • Only opened (only the startswith condition was met and therefore the transaction was evicted). In this case, closed_txn=0

So to find jobs that started but didn't complete in your time window, add | search closed_txn=0 to your search after the transaction command. If the search is running too slowly, use fields to reduce your field extractions prior to running your transaction. The whole thing might look something like this:

....search .... EventDescription="Job Created" OR EventDescription="Job Completed" 
| fields Job_ID
| transaction Job_ID startswith="Job Created" endswith="Job Completed" keepevicted=t | where duration>0  
| eval Job_Status=if(closed_txn=0, "Incomplete", "Complete")
| eval StartTime=strftime(_time, "%m-%d-%y %H:%M:%S"), EndTime=strftime(_time+duration, "%m-%d-%y %H:%M:%S")   
| eval JobDuration = floor(duration/60/60)." Hours " 
| table Job_ID Job_Status StartTime EndTime JobDuration

If interested in this approach, have a look at keepevicted in the docs here.

View solution in original post

wpreston
Motivator

Personally, I feel that transaction deserves some love. I know it is an expensive command, but it is also a powerful one with some very useful features. If you decide to keep using transaction, you can simply add the keepevicted=t flag to the transaction command to get the information you want.

It will add a closed_txn field which denotes transactions that were:

  • Opened and closed (both the startswith and endswith conditions were met). In this case, closed_txn=1
  • Only opened (only the startswith condition was met and therefore the transaction was evicted). In this case, closed_txn=0

So to find jobs that started but didn't complete in your time window, add | search closed_txn=0 to your search after the transaction command. If the search is running too slowly, use fields to reduce your field extractions prior to running your transaction. The whole thing might look something like this:

....search .... EventDescription="Job Created" OR EventDescription="Job Completed" 
| fields Job_ID
| transaction Job_ID startswith="Job Created" endswith="Job Completed" keepevicted=t | where duration>0  
| eval Job_Status=if(closed_txn=0, "Incomplete", "Complete")
| eval StartTime=strftime(_time, "%m-%d-%y %H:%M:%S"), EndTime=strftime(_time+duration, "%m-%d-%y %H:%M:%S")   
| eval JobDuration = floor(duration/60/60)." Hours " 
| table Job_ID Job_Status StartTime EndTime JobDuration

If interested in this approach, have a look at keepevicted in the docs here.

dpoloche
Explorer

Both your answer and @Runal worked. Because it was a smaller subset, I ended up sticking with transactions. The fancy evals from @Runal inspired me to improve my dashboard panel!. I had to remove duration>0 in order to see jobs that had started but not finished. I really like the idea of | eval Job_Status=if(closed_txn=0, "Incomplete", "Complete"), however it was showing complete for jobs that had started but not finished.

....search... + EventDescription="Job Created" OR EventDescription="Job Completed"
 | transaction Job_ID startswith="*Job Created" endswith="Job Completed"  keeporphans=true 
 | eval since=round((now()-_time)/3600, 1), StartTime=strftime(_time, "%m-%d-%y %H:%M:%S"), EndTime=strftime(_time+duration, "%m-%d-%y %H:%M:%S")  
 | eval JobDuration = round(duration/3600, 1)." Hours ", EndTime=if(isnull(EndTime),"Job Currently Running",EndTime), JobDuration=if(isnull(JobDuration),"Currently running time is: ". since ." Hours",JobDuration) 
 | sort -_time | table Job_ID StartTime EndTime JobDuration

Runals
Motivator

Don't get me wrong. I do love and use the transaction command; it is very powerful and versatile. When it comes to just getting the duration of a job and depending on the volume of your data and timeframe of your search though clever use of stats can get you to the same ends with a little more efficiency.

wpreston
Motivator

Absolutely, I agree that stats is more efficient. I'm just a fan of transaction and thought to spread the word 🙂

0 Karma

Runals
Motivator

I similarly try to avoid transaction command when possible. I'd try something like this as a base

EventDescription="Job Created" OR EventDescription="Job Completed" | stats count min(_time) as JobCreated max(_time) as JobCompleted range(_time) as Duration by Job_ID

I think range will supply the requisite duration but if not I'll add a secondary eval. At any rate you could spruce up your search like the following.

EventDescription="Job Created" OR EventDescription="Job Completed" | stats count min(_time) as JobCreated max(_time) as JobCompleted range(_time) as Duration by Job_ID | eval Completed = if(count=2, "Yes", "No") | eval backupDuration = JobCompleted - JobCreated | eval moFancyDurationEval = case(JobCompleted - JobCreated > 3600 AND Completed="Yes", round((JobCompleted - JobCreated) / 3600, 1) . " hrs",  JobCompleted - JobCreated < 3600 AND Completed="Yes", round((JobCompleted - JobStarted) / 60, 1) . " min", 1=1, "Job not completed. Beat the hampsters for better performance")

sundareshr
Legend

Try this...

....search ... | reverse | streamstats window=2 current=f range(_time) as dur latest(_time) as end earliest(_time) as start by jobid | search "Job Created" | table _time, jobid, start, end, dur 
0 Karma
Get Updates on the Splunk Community!

What's New in Splunk Enterprise 9.4: Features to Power Your Digital Resilience

Hey Splunky People! We are excited to share the latest updates in Splunk Enterprise 9.4. In this release we ...

Take Your Breath Away with Splunk Risk-Based Alerting (RBA)

WATCH NOW!The Splunk Guide to Risk-Based Alerting is here to empower your SOC like never before. Join Haylee ...

SignalFlow: What? Why? How?

What is SignalFlow? Splunk Observability Cloud’s analytics engine, SignalFlow, opens up a world of in-depth ...