Splunk Search

How to match start and end dates with search?

sndegwa
Explorer

I have the following result from as search and would like help matching the start and end dates. These are two separate arrays of dates.

start time  end of start
6/1/2014 7:43   6/1/2014 8:42
6/2/2014 6:31   6/1/2014 8:52
6/3/2014 5:22   6/1/2014 9:16
6/4/2014 6:27   6/2/2014 7:30
6/5/2014 6:22   6/3/2014 6:19
6/6/2014 6:17   6/3/2014 10:33
6/7/2014 9:09   6/3/2014 18:08
6/8/2014 6:51   6/4/2014 7:05
6/9/2014 8:12   6/4/2014 7:13
6/10/2014 6:23  6/5/2014 6:58
6/14/2014 8:05  6/6/2014 6:55

The result I expect is shown below:

start time  end of start
6/1/2014 7:43   6/1/2014 8:42
6/2/2014 6:31   6/2/2014 7:30
6/3/2014 5:22   6/3/2014 6:19
6/4/2014 6:27   6/4/2014 7:05

..and so forth.

Note that I skip all start dates that occur before the most recent(previous) end of start and skip any end of start that occurs before the most recent(previous) start time to build my pairs of start and end. How do i accomplish this is a Splunk search?

0 Karma
1 Solution

sndegwa
Explorer

After some research I came up with what seems a good enough (not the best) solution for this problem. The issue was that I was framing the problem in the context of how I would solve it in an object oriented programming language and not SPL.I had two arrays of dates that I would need to loop through and create date pairs as my solution.
To reframe the problem in Splunk terms I had to combine the two arrays into one and keep them in their original chronological events and add a new column to the event to indicate whether it is a Start Time or an End Of Start event.So my list of events would now look like this:

Equipment   EventList   ctonline    cteos
CT1      6/1/2014 7:43  StartTime   
CT1      6/1/2014 8:42            End  Of Start
CT1      6/1/2014 8:52            End  Of Start
CT1      6/1/2014 9:16            End  Of Start
CT1      6/2/2014 6:31  StartTime   
CT1      6/2/2014 7:30            End  Of Start
CT1      6/3/2014 5:22  StartTime   
CT1      6/3/2014 6:19            End  Of Start
CT1      6/3/2014 10:33           End  Of Start
CT1      6/3/2014 18:08           End  Of Start
CT1      6/4/2014 6:27  StartTime   
CT1      6/4/2014 7:05            End  Of Start
CT1      6/4/2014 7:13            End  Of Start
CT1      6/5/2014 6:22  StartTime   
CT1      6/5/2014 6:58            End  Of Start

Given this I can then use the streamstats command to mark a valid StartTime event as one that occurs after an non-StartTime event and a valid End of Start event as one that occurs after a non-End of Start event.The result of these can be stored in two new columns as shown below:

Equipment   EventList       ctonline        cteos   prevctonline    prevcteos
CT1         6/1/2014 7:43   StartTime                   1   0
CT1         6/1/2014 8:42              End  Of Start    0   1
CT1         6/1/2014 8:52              End  Of Start    0   0
CT1         6/1/2014 9:16              End  Of Start    0   0
CT1         6/2/2014 6:31   StartTime                   1   0
CT1         6/2/2014 7:30               End  Of Start   0   1
CT1         6/3/2014 5:22   StartTime                   1   0
CT1         6/3/2014 6:19               End  Of Start   0   1
CT1         6/3/2014 10:33              End  Of Start   0   0
CT1         6/3/2014 18:08              End  Of Start   0   0
CT1         6/4/2014 6:27   StartTime                   1   0
CT1         6/4/2014 7:05               End  Of Start   0   1
CT1         6/4/2014 7:13               End  Of Start   0   0
CT1         6/5/2014 6:22   StartTime                   1   0
CT1         6/5/2014 6:58               End  Of Start   0   1

Now we can apply the transact command to group events that only fall between a valid StartTime and a valid End of Start event. Below is the code to achieve this.

...|streamstats current=f window=1 last(ctonline) as prevctonline last(cteos) as prevcteos | reverse | transaction Equipment startswith=eval(ctonline!=coalesce(prevctonline,"") and ctonline="Start") endswith=eval(cteos!=coalesce(prevcteos,"") and cteos="EndOfStart") | sort TimeStamp | timechart avg(duration) by Equipment

The result of the query above gives you the correct grouping for the Start Time and End of Start events .. without having to loop through the arrays as before.Hopefully this helps someone else.

View solution in original post

sndegwa
Explorer

After some research I came up with what seems a good enough (not the best) solution for this problem. The issue was that I was framing the problem in the context of how I would solve it in an object oriented programming language and not SPL.I had two arrays of dates that I would need to loop through and create date pairs as my solution.
To reframe the problem in Splunk terms I had to combine the two arrays into one and keep them in their original chronological events and add a new column to the event to indicate whether it is a Start Time or an End Of Start event.So my list of events would now look like this:

Equipment   EventList   ctonline    cteos
CT1      6/1/2014 7:43  StartTime   
CT1      6/1/2014 8:42            End  Of Start
CT1      6/1/2014 8:52            End  Of Start
CT1      6/1/2014 9:16            End  Of Start
CT1      6/2/2014 6:31  StartTime   
CT1      6/2/2014 7:30            End  Of Start
CT1      6/3/2014 5:22  StartTime   
CT1      6/3/2014 6:19            End  Of Start
CT1      6/3/2014 10:33           End  Of Start
CT1      6/3/2014 18:08           End  Of Start
CT1      6/4/2014 6:27  StartTime   
CT1      6/4/2014 7:05            End  Of Start
CT1      6/4/2014 7:13            End  Of Start
CT1      6/5/2014 6:22  StartTime   
CT1      6/5/2014 6:58            End  Of Start

Given this I can then use the streamstats command to mark a valid StartTime event as one that occurs after an non-StartTime event and a valid End of Start event as one that occurs after a non-End of Start event.The result of these can be stored in two new columns as shown below:

Equipment   EventList       ctonline        cteos   prevctonline    prevcteos
CT1         6/1/2014 7:43   StartTime                   1   0
CT1         6/1/2014 8:42              End  Of Start    0   1
CT1         6/1/2014 8:52              End  Of Start    0   0
CT1         6/1/2014 9:16              End  Of Start    0   0
CT1         6/2/2014 6:31   StartTime                   1   0
CT1         6/2/2014 7:30               End  Of Start   0   1
CT1         6/3/2014 5:22   StartTime                   1   0
CT1         6/3/2014 6:19               End  Of Start   0   1
CT1         6/3/2014 10:33              End  Of Start   0   0
CT1         6/3/2014 18:08              End  Of Start   0   0
CT1         6/4/2014 6:27   StartTime                   1   0
CT1         6/4/2014 7:05               End  Of Start   0   1
CT1         6/4/2014 7:13               End  Of Start   0   0
CT1         6/5/2014 6:22   StartTime                   1   0
CT1         6/5/2014 6:58               End  Of Start   0   1

Now we can apply the transact command to group events that only fall between a valid StartTime and a valid End of Start event. Below is the code to achieve this.

...|streamstats current=f window=1 last(ctonline) as prevctonline last(cteos) as prevcteos | reverse | transaction Equipment startswith=eval(ctonline!=coalesce(prevctonline,"") and ctonline="Start") endswith=eval(cteos!=coalesce(prevcteos,"") and cteos="EndOfStart") | sort TimeStamp | timechart avg(duration) by Equipment

The result of the query above gives you the correct grouping for the Start Time and End of Start events .. without having to loop through the arrays as before.Hopefully this helps someone else.

sndegwa
Explorer

Just updated the question with a link to the sample events and the logic behind the start and end times.

0 Karma

somesoni2
Revered Legend

Please share some raw events. It may help try other directions as well..

0 Karma

sndegwa
Explorer

Ok.Let me read through those in the meantime.Maybe my approach is wrong. Give it a crack and if need be I can post the native events that I searched to get the two arrays and see if there is another easier way around the problem.

0 Karma

somesoni2
Revered Legend

I will give this a try but it seems very complex for Splunk search language (needs while looping) and more or less asks for custom search command where you can do this looping in a python script. Some readings
http://blogs.splunk.com/2014/04/14/building-custom-search-commands-in-python-part-i-a-simple-generat...

http://blogs.splunk.com/2011/11/30/using-custom-search-commands-with-splunk-python-sdk/

0 Karma

sndegwa
Explorer

No really.After picking up the first start-pair (i.e "start time"- "end of start") a new start can occur only if its after the last "end of start" , if not then we would have to walk down the "start time" rows until we hit a start time that is AFTER the last "end of start". In my current list none of the starts will be skipped as they always occur after the previous pair's "end of start". But that contrary situation my happen.

0 Karma

sndegwa
Explorer

These are multivalued fields i.e multiple rows exist. The number of rows in in the "start time" field may not be necessary the same as those in the "end of start" field.In my current case I have 43 rows in "start time" and 129 in the "end of start" many of which will be skipped since they dont match my condition.

0 Karma

somesoni2
Revered Legend

The fields "start time" and "end of start" are multivalued field or single valued fields? how many rows are we talking about here?

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...