Hi All,
I'm just getting started with Splunk, and am having a problem calculating the time for repeating values from a device that only logs its current state. The source looks like this:
06-08-2015 08:00:01.650; ICU ;ISsC3
06-08-2015 08:00:03.981; ICU ;ISsC3
06-08-2015 08:00:04.275; ICU ;ISsC3
...
06-08-2015 08:05:08.175; ICU ;ISsC18
06-08-2015 08:05:04.575; ICU ;ISsC18
I am interested in getting the delta (or summing the time in the C3 state) from the first ISsC3 event to the first event that's not ISsC3, skipping over the middle set and the next set of non-C3 events, and summarizing these as sessions.
My search looks like this:
mysearch ...|transaction startswith=IcuState=C3 endswith=IcuState!=C3 |table _time duration
I don't have a session ID, just a start and stop event for each log file. The output seems to be a pairing of every C3 with every non-C3. I'm fine with deduping the middle data, since it's not interesting for this search.
What am I missing?
I found a different technique. I'm not sure how well this will scale, but it seems like it might not be too bad.
I created a sample csv file with a bunch of rows of two values: a time (just to spread it out a bit time-wise) and a "field2" holding some letter, each repeated a few times, so my data looks roughly like yours, only simpler. An excerpt follows so you can see where I was coming from and modify it to what you need (there are quite a few more rows than this):
6/10/2015 8:29 A
6/10/2015 8:32 A
6/10/2015 8:35 B
6/10/2015 8:37 B
6/10/2015 8:40 B
6/10/2015 8:43 B
6/10/2015 8:46 C
6/10/2015 8:49 C
6/10/2015 8:52 C
6/10/2015 8:55 C
6/10/2015 8:58 A
6/10/2015 9:00 A
Now, it seems all attempts at doing ... |transaction field2
fail, because that links together ALL your C's with each other, when you want them split by the other letters in between. I worked down a few paths and finally found something that works.
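To see why grouping on field2 alone merges everything, here's a tiny Python sketch (an analogy, not SPL; the event list is invented): collecting events purely by their field2 value lumps every "C" into one bucket, regardless of the other letters between them.

```python
# Grouping purely by value, the way a bare "transaction field2" behaves:
# every C lands in one group, even across intervening A's and B's.
from collections import defaultdict

events = ["A", "A", "B", "B", "C", "C", "C", "A", "C", "C"]
by_value = defaultdict(list)
for i, v in enumerate(events):
    by_value[v].append(i)

print(by_value["C"])  # all five C events end up in a single group: [4, 5, 6, 8, 9]
```

What we actually want is one group per consecutive run of C's, which is what the splitter trick below achieves.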
The search index="throw_away" | streamstats window=2 distinct_count(field2) as splitter | table field1, field2, splitter
gives me
6/10/2015 10:53 C 1
6/10/2015 10:50 C 1
6/10/2015 10:47 C 1
6/10/2015 10:44 C 1
6/10/2015 10:41 B 2
6/10/2015 10:38 B 1
6/10/2015 10:36 B 1
You can see that when field2 switches, "splitter" becomes 2. That's because I'm using a window of two events and counting the distinct values of field2: C->C is one distinct value, but at the C->B transition the distinct count is two.
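The windowed distinct count can be sketched in Python (an analogy, not SPL; note that in the Splunk output above events arrive newest-first, while this sketch just walks the list in order):

```python
# A minimal analogue of: streamstats window=2 distinct_count(field2) as splitter
# For each event, count distinct values over the current and previous event:
# 1 means "same state as before", 2 marks a state change.
def splitter(values):
    out = []
    for i in range(len(values)):
        window = values[max(0, i - 1): i + 1]  # at most 2 events
        out.append(len(set(window)))
    return out

# The very first event has no predecessor, so its window holds one value.
print(splitter(["C", "C", "C", "B", "B", "A"]))  # [1, 1, 1, 2, 1, 2]
```

Every 2 in the output is a boundary where a new transaction should begin.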
Adding a transaction on field2 with startswith=splitter>1 tossed in there: index="throw_away" | streamstats window=2 distinct_count(field2) as splitter | transaction startswith=splitter>1 field2
now gives me transactions for each "group" independently, because a new transaction starts whenever splitter is greater than one, while events are still connected on field2.
6/10/15 9:03:00.000 AM
6/10/2015 9:03,A
6/10/2015 10:21,A
6/10/2015 10:24,A
6/10/2015 10:27,A
6/10/15 8:55:00.000 AM
6/10/2015 8:55,C
6/10/2015 9:06,C
6/10/2015 9:09,C
6/10/2015 9:12,C
6/10/2015 9:15,C
And finally, we only want the transactions where field2 is a "C", so we can strip out all the rest. Then I pipe it to a table to make it pretty... index="throw_away" | streamstats window=2 distinct_count(field2) as splitter | transaction startswith=splitter>1 field2 | search field2="C" | table _time, field2, duration
_time field2 duration
2015-06-10 10:18:00 C 2100
2015-06-10 09:35:00 C 2400
(and so on...)
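The end-to-end pipeline can be sketched in Python as well (an analogy under invented sample data, not SPL): collapse consecutive rows with the same field2 into one "transaction", keep only the C groups, and report each group's duration as last timestamp minus first, which is how transaction computes duration.

```python
# Analogue of: streamstats ... | transaction startswith=splitter>1 field2
#              | search field2="C" | table _time, field2, duration
from datetime import datetime
from itertools import groupby

def c_sessions(rows):
    """rows: (timestamp_string, state) pairs, oldest first."""
    parsed = [(datetime.strptime(t, "%m/%d/%Y %H:%M"), s) for t, s in rows]
    sessions = []
    # groupby only merges *consecutive* equal states, which is exactly
    # the run-splitting behavior the splitter trick provides in SPL.
    for state, grp in groupby(parsed, key=lambda r: r[1]):
        grp = list(grp)
        if state == "C":
            duration = (grp[-1][0] - grp[0][0]).total_seconds()
            sessions.append((grp[0][0], state, duration))
    return sessions

rows = [
    ("6/10/2015 8:46", "C"), ("6/10/2015 8:49", "C"),
    ("6/10/2015 8:52", "C"), ("6/10/2015 8:55", "C"),
    ("6/10/2015 8:58", "A"), ("6/10/2015 9:00", "A"),
]
print(c_sessions(rows))  # one C session, 8:46 start, 540.0 seconds long
```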
How does that look?
Hi @marckg
I noticed you were active on the site a couple days ago, but this post is almost 1 year old. Can you please revisit this post and confirm whether or not the answer by @rich7177 solved your issue?
Will the optional "unifyends=t" for your transactions help?
mysearch ... |transaction IcuState unifyends=t
That may get you transactions based on IcuState, but grouped the way you want them. Once you have that, you could probably try
mysearch ... |transaction IcuState unifyends=t | search IcuState=C3
Which will then include only the transactions of the right IcuState.
Does that help?
No, it didn't really help, even when I qualified the transaction with startswith and endswith 😞
You want to know the length of state C3 as it exists between other states?
So, trimming your example down to bare bones, if you have C3, C3, C3, C18, C18, C3, C3, C3, C3, C12, C3, C3, C3, then you want to end up with 3 durations, using the transitions to other states as your key to closing each transaction but otherwise ignoring those other states? (I don't know if C12 is valid, I just made it up...)
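That bare-bones sequence can be sketched in Python (an analogy, not SPL) to show the expected grouping: consecutive runs of C3 collapse to one transaction each, and the other states only serve to close them off.

```python
# Collapse consecutive identical states into runs, keep only the C3 runs.
from itertools import groupby

states = ["C3", "C3", "C3", "C18", "C18",
          "C3", "C3", "C3", "C3", "C12", "C3", "C3", "C3"]

c3_runs = [len(list(g)) for state, g in groupby(states) if state == "C3"]
print(c3_runs)  # three separate C3 runs, as expected: [3, 4, 3]
```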
Yes, exactly what I'm looking for.