Splunk Search

Grouping the events

Contributor

My requirement is to group events (list of jobs) based on their status.

The status value starts with RUNNING and may end with SUCCESS OR FAILURE.

I want make a table which shows the list of jobs along with the start time, end time, and current status (whether RUNNING, SUCCESS, OR FAILURE)

My table should display RUNNING status until the job ends up with SUCCESS or FAILURE.

In case any one of the jobs is FAILED, the status should be FAILURE. Once the issue is fixed and the same job is status=SUCCESS, I want my table to display SUCCESS. It should be a single entry for the job. Not separate entry for the failed one and the success one.

I have tried two methods as mentioned below, but I'm finding a defect in both of the methods.

Method 1:

index=XXX sourcetype=yyy autosys_job=* autosys_status=* | transaction autosys_job keepevicted=true startswith=RUNNING endswith=eval((match autosys_status, "SUCCESS") OR (match autosys_status, "FAILURE")) |eval  starttime=_time | eval endtime=_time+duration | stats  last(autosys_status) AS CurrentStatus by autosys_job starttime endtime

The issue I am facing with this method is in case of any job getting failed and then running to success after sometime. There are separate entries displayed for the FAILED and SUCCESS, but I don't want to display the failed entry once it is run to success.

Method 2:

 index=XXX sourcetype=yyy autosys_job=* autosys_status=* | stats latest(autosys_status) AS currentstatus, earliest(_time) AS Starttime, latest(_time) AS Endtime by autosys_job

Issue I have an issue with this method when the cause is failing.
i.e. one job starts running then fails and after some time it is running to success. Here start time in the query takes initial start time (failed one) and endtime in the query takes the latest time from success one. But I want to display the start time and end time for the success run.

Please let me know how can I resolve this one.

0 Karma
1 Solution

SplunkTrust
SplunkTrust

It was good you tried to avoid transaction, which you should avoid whenever you can. You just need to prep the records a bit before executing the stats command.

Try this...

index=XXX sourcetype=yyy autosys_job=* autosys_status=* 
| fields autosys_job autosys_status
| rename COMMENT as "the above selects the desired events and limits them to desired fields.  _time is also retained"

| rename COMMENT as "We find the last time each job started, and copy it onto all records for that job"
| rename COMMENT as "Then get rid of any RUNNING, SUCCESS OR FAILURE from prior runs"
| eval last_start=if(autosys_status="RUNNING",_time,null())
| eventstats latest(last_start) as last_start by autosys_job
| where _time >= last_start

| rename COMMENT as "The only remaining records are from the current run.  End_time will come from any SUCCESS or FAILURE, and be empty otherwise."
| eval last_end=if(autosys_status="SUCCESS" OR autosys_status="FAILURE",_time, null())
| stats min(last_start) as StartTime, max(last_end) as EndTime, 
    latest(autosys_status) as CurrentStatus by autosys_job  

updated last(last_start) to latest(last_start).

View solution in original post

0 Karma

SplunkTrust
SplunkTrust

It was good you tried to avoid transaction, which you should avoid whenever you can. You just need to prep the records a bit before executing the stats command.

Try this...

index=XXX sourcetype=yyy autosys_job=* autosys_status=* 
| fields autosys_job autosys_status
| rename COMMENT as "the above selects the desired events and limits them to desired fields.  _time is also retained"

| rename COMMENT as "We find the last time each job started, and copy it onto all records for that job"
| rename COMMENT as "Then get rid of any RUNNING, SUCCESS OR FAILURE from prior runs"
| eval last_start=if(autosys_status="RUNNING",_time,null())
| eventstats latest(last_start) as last_start by autosys_job
| where _time >= last_start

| rename COMMENT as "The only remaining records are from the current run.  End_time will come from any SUCCESS or FAILURE, and be empty otherwise."
| eval last_end=if(autosys_status="SUCCESS" OR autosys_status="FAILURE",_time, null())
| stats min(last_start) as StartTime, max(last_end) as EndTime, 
    latest(autosys_status) as CurrentStatus by autosys_job  

updated last(last_start) to latest(last_start).

View solution in original post

0 Karma

Contributor

Wow!!! Awesome!!!! @DalJeanis. It did the magic!! If framing the complex queries its an art, you are picasso in it.

With the small modification in step 8 my need served.

 | eventstats latest(last_start) as last_start by autosys_job

SplunkTrust
SplunkTrust

@Kwip - Thanks! Good catch. FYI, I had removed a sort and streamstats, switching to eventstats, and had neglected to change last() to latest().