Splunk Search

Adjust time difference for suspended and unsuspended events

1067062
New Member

I am trying to find a solution for adjust my time interval for time to resolve. There are two indexes being used, the first captures the summary records with CLOSE_TIME and OPEN_TIME, the second captures the supporting activities SUSPENDED and UNSUSPENDED.

The summary events have the following fields - this is the summary index (lets call it index=smmary)
ID
OPEN_TIME
CLOSE_TIME
SUSPEND_COUNT

The activities events have the following fields - this is the activities index (lets call it index=activities)
ID - matches the ID above in the summary table
DATESTAMP - time that activity happened
TYPE - whether it was suspended or unsuspended

I am trying to adjust the summary data for the matching activities in in order to adjust CLOSE_TIME - OPEN_TIME to reflect the actual time that a ticket was open for - using TYPE=SUSPENDED and TYPE=UNSUSPENDED , and pulling their DATESTAMP field to make the adjustment.

Most examples of calculating duration between events dont help here. The tricky part is that for each ID in index=summary, I have to check each index=activities for a mathcing ID, and if it matches I have to handle TYPE=SUSPENDED AND TYPE=UNSUSPENDED differently. I have to take the difference of UNSUSPENDED - SUSPENDED and subtract that amount from the original CLOSE_TIME - OPEN_TIME difference in the index=summary.

The other tricky part is that there can be any number of SUSPENDED and UNSUSPENDED events for a given ID, or there can be none at all.

These last two challenges have been what separates my use case from the others I have found.

Here is the SPL I have written so far:

index=main source=summary
| dedup ID
| join type=left max=0 INCIDENT_ID [search index=main source="activities" TYPE=Suspended TYPE=UNSUSPENDED]
| eval openTimeNum = strptime(OPEN_TIME, "%Y-%m-%d %H:%M:%S")
| eval closeTimeNum = strptime(CLOSE_TIME, "%Y-%m-%d %H:%M:%S")
| eval mttr_unadjusted = (closeTimeNum-openTimeNum)
| eval finalTimeInterval = 0
| foreach DATESTAMP [
eval curTimeEvent = strptime(<>, "%Y-%m-%d %H:%M:%S")
| eval finalTimeInterval = if(isnull(DATESTAMP), mttr_unadjusted, if(index=0, curTimeEvent - closeTimeNum, curTimeEvent - finalTimeInterval))
| eval index = index + 1
]

This doesnt work obviously, I was just playing with it, but this is the base I have so far. I added the index variable as an attempt to monitor how many iterations the foreach loop has used.

I have attempted using transactions, though I am not having success when using the following SPL (to be honest I am not exactly sure what is happening here, and even the transaction command is capable of handling the logic I require):

index=main source=summary OR source=activities TYPE=Suspended OR TYPE=UNSUSPENDED
| transaction ID
| chart count by duration

---> Yields the following results: (The zero value has a count because there are some summary records that do not have any supporting activities)
duration count
0 2563
18 3
19 6
20 3
21 3
23 4
24 2
25 1
28 3
29 2
31 4
32 3
34 1
35 1
36 1
37 2

Any help would be appreciated, as I just need to know at least where to look. So far I still believe that foreach is the best option (but that is just because I am a native javascript developer and am used to handling logic that way)

Thank you for your time and consideration!

0 Karma

DalJeanis
SplunkTrust
SplunkTrust

First, try something like this -

index=main (source=activities) (TYPE="Suspended" OR TYPE="UNSUSPENDED")
| transaction ID endswith="UNSUSPENDED" startswith="Suspended"
| eval duration = if(duration==0,now()-_time,duration)

You can limit that to a couple of IDs initially, and verify that the pairs of suspended and unsuspended and duration look reasonable. Also check, for a transaction that has only one event (suspended) that the calculated duration in seconds since being suspended looks right.

Next, run that into this -

| stats sum(duration) as minusduration, sum(eventcount) as eventcount, count as suspendcount by ID 

Now you have the total amount of minusduration you have to remove from the other query.


NOTE - Actually, your zero count are probably the records that have been suspended once and never yet unsuspended.

0 Karma

1067062
New Member

I think this works as I need:

    index=main source="activities" (TYPE=Suspended OR TYPE=UNSUSPENDED)
    | eval TYPE=lower(TYPE) 
    | fields ID TYPE 
    | join type=left max=0ID 
       [ search index=main source=summary
       | dedup ID 
       | fields ID OPEN_TIME CLOSE_TIME] 
    | search CLOSE_TIME=* NOT CLOSE_TIME=4000* 
    | eval openTimeNum = strptime(OPEN_TIME, "%Y-%m-%d %H:%M:%S") 
    | eval closeTimeNum = strptime(CLOSE_TIME, "%Y-%m-%d %H:%M:%S") 
    | eval mttr_unadjusted_in_hrs = (closeTimeNum-openTimeNum)/(60*60) 
    | sort ID _time 
    | streamstats current=f last(_time) as LastTime by ID 
    | eval duration_in_hrs = if(TYPE="unsuspended", (_time-LastTime)/(60*60), 0) 
    | eventstats sum(duration_in_hrs) as total_suspend_in_hrs by ID 
    | eval mttr_adjusted=mttr_unadjusted_in_hrs - total_suspend_in_hrs 
    | dedup ID 
    | table ID _time LastTime duration_in_hrs total_suspend_in_hrs mttr_unadjusted_in_hrs mttr_adjusted  OPEN_TIME CLOSE_TIME TYPE
0 Karma

1067062
New Member

I think this makes more sense here, based on what you put:

    index=main source=summary
    | dedup ID
    | join type=left max=0 ID [search index=main source=activities (TYPE=Suspended OR TYPE=UNSUSPENDED)]
    | transaction TYPE endswith="Unsuspended" startswith="Suspended" maxevents=2
    | stats sum(duration) AS minusduration BY ID

This still returns many more 0 durations than expected. I am not sure how to interpret those 0 results.

0 Karma

1067062
New Member

Thanks for the response. I am going to give this a shot and observe the results. I am looking at your SPL, and the endswith="UNSUSPENDED" may not be accurate as there could be many UNSUSPENDED for a single ID. It doesnt necessarily end with the first UNSUSPENDED. However, I am looking at the latter part of your SPL, and then you use stats to combine everything by ID again.

So if I am understanding your SPL correctly, then you trasact every Suspend and Unsuspend grouping, get the duration, then you combine them all by ID. This could work, just have to test. Implementing your SPL, I would then have to subtract the minusduration from the original duration.

Let me test and I will get back to you. Thanks!

0 Karma

1067062
New Member

Just using your code, this is the output:
ID, minusduration, eventcount, suspendcount
ID1 193453 1 1
ID2 170153 1 1
ID3 184895 1 1
ID4 193753 1 1
ID5 160602 1 1
ID6 182344 1 1
ID7 107694 1 1
ID8 98973 1 1
ID9 97477 1 1
ID10 160636 1 1
ID11 107997 1 1
ID12 171899 1 1
ID13 77024 1 1
ID14 82132 1 1
ID15 182065 1 1
ID16 171443 1 1
ID17 77113 1 1
ID18 223942 2 2
ID19 107109 1 1
ID20 210801 1 1

0 Karma

1067062
New Member

Upon further testing, I see all events have a duration of 0 when this line is removed: | eval duration = if(duration==0, now()-_time, duration)

I think that is because TYPE is not specified in the field list of the transaction command. I added it, but still 0 values.

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...