Splunk Search

Calculating Duration of Extracted Fields

pgates
Explorer

I'm having trouble getting a duration between two timestamps from some extracted fields.

My search looks like this:

 

MySearchCriteria index=MyIndex source=MySource
| stats list(ExtractedFieldStartTime) as MyStartTime, list(ExtractedFieldEndTime) as MyEndTime by AnotherField
| eval MyStartUnix=strptime(MyStartTime, "%Y-%m-%dT%H:%M:%S")
| eval MyEndUnix=strptime(MyEndTime, "%Y-%m-%dT%H:%M:%S")
| eval diff=MyEndUnix-MyStartUnix
| table MyStartTime MyEndTime MyStartUnix MyEndUnix diff

 


And my table is returned as:

MyStartTimeMyEndTimeMyStartUnixMyEndUnixdiff
2023-10-10T14:48:39
2023-10-10T14:15:15
1696963719.000000
1696961715.000000
 
2023-10-10T14:57:502023-10-10T13:56:531696964270.0000001696960613.000000 
Labels (2)
0 Karma
1 Solution

yuanliu
SplunkTrust
SplunkTrust

I have a strong suspicion that your mock output is misleading.  The correct mock output most likely look like this instead:

MyStartTime
MyEndTime
MyStartUnix
MyEndUnix
diff
2023-10-10T14:48:39
2023-10-10T14:57:50
2023-10-10T14:15:15
2023-10-10T13:56:53
1696974519.000000
1696975070.000000
1696972515.000000
1696971413.000000
 
 

In other words, instead of the two start and end pairs in two rows, they are in the same row for a given value of AnotherField which you didn't show.

This is because you use list function by AnotherField.  Very likely there are more than one start-end pairs per AnotherField.  These multivalued fields cannot be used in arithmetic operations directly.  Before I describe a method to handle multivalue fields, let me first get some clarifications.

  1. Is it really important to use list function?  Are there overlapping MyStartTime, overlapping MyEndTime, or overlapping intervals that magically end up in the correct sequence?  If not, using values is a lot cheaper and you won't be subject to memory limitations. (Because we are looking at ISO timestamps, values with order them correctly.)
  2. Is it really important to calculate diff after stats?  If you are listing/tallying values of every start-end pair, it is actually cheaper to calculate diff before stats. (If MyStartTime and MyEndTime don't appear in the same event, of course, you don't have a choice.)
  3. I cannot see real importance of listing MyStartUnix and MyEndUnix in final results, so the following will simply ignore them.

With these caveats, you can use mvmap to handle multivalued field after stats.  In the following, I assume that each start is paired with an end.

 

MySearchCriteria index=MyIndex source=MySource
| stats list(ExtractedFieldStartTime) as MyStartTime, list(ExtractedFieldEndTime) as MyEndTime by AnotherField
| eval idx = mvrange(0, mvcount(MyStartTime))
| eval diff=mvmap(idx, strptime(mvindex(MyEndTime, idx), "%Y-%m-%dT%H:%M:%S")-strptime(mvindex(MyStartTime, idx), "%Y-%m-%dT%H:%M:%S"))
| fields - idx

 

This will give you

AnotherField
MyStartTime
MyEndTime
diff
another
2023-10-10T14:48:39
2023-10-10T14:57:50
2023-10-10T14:15:15
2023-10-10T13:56:53
-2004.000000
-3657.000000

(Your samples have ends before starts, hence negative diffs.)

 

View solution in original post

Tags (1)

yuanliu
SplunkTrust
SplunkTrust

I have a strong suspicion that your mock output is misleading.  The correct mock output most likely look like this instead:

MyStartTime
MyEndTime
MyStartUnix
MyEndUnix
diff
2023-10-10T14:48:39
2023-10-10T14:57:50
2023-10-10T14:15:15
2023-10-10T13:56:53
1696974519.000000
1696975070.000000
1696972515.000000
1696971413.000000
 
 

In other words, instead of the two start and end pairs in two rows, they are in the same row for a given value of AnotherField which you didn't show.

This is because you use list function by AnotherField.  Very likely there are more than one start-end pairs per AnotherField.  These multivalued fields cannot be used in arithmetic operations directly.  Before I describe a method to handle multivalue fields, let me first get some clarifications.

  1. Is it really important to use list function?  Are there overlapping MyStartTime, overlapping MyEndTime, or overlapping intervals that magically end up in the correct sequence?  If not, using values is a lot cheaper and you won't be subject to memory limitations. (Because we are looking at ISO timestamps, values with order them correctly.)
  2. Is it really important to calculate diff after stats?  If you are listing/tallying values of every start-end pair, it is actually cheaper to calculate diff before stats. (If MyStartTime and MyEndTime don't appear in the same event, of course, you don't have a choice.)
  3. I cannot see real importance of listing MyStartUnix and MyEndUnix in final results, so the following will simply ignore them.

With these caveats, you can use mvmap to handle multivalued field after stats.  In the following, I assume that each start is paired with an end.

 

MySearchCriteria index=MyIndex source=MySource
| stats list(ExtractedFieldStartTime) as MyStartTime, list(ExtractedFieldEndTime) as MyEndTime by AnotherField
| eval idx = mvrange(0, mvcount(MyStartTime))
| eval diff=mvmap(idx, strptime(mvindex(MyEndTime, idx), "%Y-%m-%dT%H:%M:%S")-strptime(mvindex(MyStartTime, idx), "%Y-%m-%dT%H:%M:%S"))
| fields - idx

 

This will give you

AnotherField
MyStartTime
MyEndTime
diff
another
2023-10-10T14:48:39
2023-10-10T14:57:50
2023-10-10T14:15:15
2023-10-10T13:56:53
-2004.000000
-3657.000000

(Your samples have ends before starts, hence negative diffs.)

 

Tags (1)

PickleRick
SplunkTrust
SplunkTrust

With Splunk usually if something looks like a number but doesn't behave like a number it means that it's not a number but a string representation of a number and you have to tonumber() it. But in this case since strptime should give you a number it would be a bit surprising.

0 Karma

Law2
New Member

Hey you may need to change them to a string. Can  you try this and let me know if it works ( I am new to Splunk lol)

MySearchCriteria index=MyIndex source=MySource
| stats list(ExtractedFieldStartTime) as MyStartTime, list(ExtractedFieldEndTime) as MyEndTime by AnotherField
| eval MyStartUnix=strptime(MyStartTime, "%Y-%m-%dT%H:%M:%S")
| eval MyEndUnix=strptime(MyEndTime, "%Y-%m-%dT%H:%M:%S")
| eval diff= tostring((MyEndUnix - MyStartUnix),"duration")
| table MyStartTime MyEndTime MyStartUnix MyEndUnix diff

 

0 Karma

pgates
Explorer

No, that doesn't work.  I believe the reason it doesn't work is because it is just attempting to change the value of the equation (end - start) to a string, and that value appears to be empty for some reason.

I appreciate the try though.

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Get the T-shirt to Prove You Survived Splunk University Bootcamp

As if Splunk University, in Las Vegas, in-person, with three days of bootcamps and labs weren’t enough, now ...

Wondering How to Build Resiliency in the Cloud?

IT leaders are choosing Splunk Cloud as an ideal cloud transformation platform to drive business resilience,  ...