
How to calculate duration of overlapping events from multiple Services

Path Finder

I have been working on this for quite some time and it appears I am just going in circles. Maybe some Splunk savant will be able to work the kinks out.

I have a set of normalized data which contains the starttime, endtime, AppName, InstanceName, Type, EventName, duration. My data looks like this:

1535432400, 1535432700, App1, measure, 1, _m_WS_Time ws/cb_App1_Requests.v4_1.ws.producer.App1_requests/cb_App1_Requests_v4_1_ws_producer_App1_requests_Port?_getTaskList, 300,1535436019_0
1535443200, 1535443500, App1, measure, 1, _m_WS_Time ws/cb_App1_Requests.v4_1.ws.producer.App1_requests/cb_App1_Requests_v4_1_ws_producer_App1_requests_Port?_getTaskList, 300,1535446818_0
1535446800, 1535447100, App1, measure, 1, _m_WS_Time ws/cb_App1_Requests.v4_1.ws.producer.App1_requests/cb_App1_Requests_v4_1_ws_producer_App1_requests_Port?_getTaskList, 300,1535450417_0
1535447730, 1535448030, App4, alvelca01, 1, App4 Doc_Admin_Prod High PurePath Response Time, 300,1535641220_4
1535468400, 1535469000, App1, measure, 1, _m_WS_Time ws/cb_App1_Requests.v4_1.ws.producer.App1_requests/cb_App1_Requests_v4_1_ws_producer_App1_requests_Port?_getTaskList, 600,1535472019_0
1535471219, 1535474819, App2, ualbuacwas6, 1, App2 Online - High Active Thread Count, 3600,1535472017_0
1535471219, 1535474819, App2, ualbuacwas5, 1, App2 Online - High Active Thread Count, 3600,1535472017_0
1535471269, 1535474869, App2, ualbuacwas7, 1, App2 Online - High Active Thread Count, 3600,1535472017_0
1535471319, 1535474919, App2, ualbuacwas6, 1, High App2 WCAX JDBC Pool Percent Usage, 3600,1535472017_1
1535471319, 1535471449, App2, ualbuacwas7, 1, High App2 WCAX JDBC Pool Percent Usage, 130,1535472017_1
1535479849, 1535483449, App2, ualbuacwas5, 1, High App2 JDBC Pool Percent Usage, 3600,1535482816_1
1535481100, 1535481103, App3, ip-10-14-6-210.ec2.internal, 1, Application Process Unavailable (unexpected), 3,1535482817_0
1535481100, 1535481107, App3, ip-10-14-6-44.ec2.internal, 1, Application Process Unavailable (unexpected), 7,1535482817_1
1535481164, 1535481165, App4, alvelcw01, 1, Application Process Unavailable (unexpected), 1,1535641220_3
1535481348, 1535484948, App2, ualbuacwas8, 1, App2 Online - Hung Threads, 3600,1535482816_2
1535481348, 1535484948, App2, ualbuacwas7, 1, App2 Online - Hung Threads, 3600,1535482816_2
1535481348, 1535484948, App2, ualbuacwas6, 1, App2 Online - Hung Threads, 3600,1535482816_2
1535512218, 1535512288, App2, ualbuacwas5, 1, Application Process Unavailable (unexpected), 70,1535515215_0

I have tried to use concurrency with transaction:

base search ....
| concurrency start=stime duration=duration output=overlay
| table _time Service EventName duration overlay

The concurrency command is not splitting the Services out. But, now that I've looked at it, it shouldn't: it calculates concurrency across all overlapping events, not per Service.

What I am looking for is the durations of the overlaps by Service, a lot like what the Timeline visualization does.

Example:
The first event for App1 starts at 10:30am and its duration is 300 seconds. The next event for App1 starts at 10:32 and lasts 300 seconds, and so on. I want the Service's total duration of events from the start of the first overlapping event to the end of the last. To throw a wrench into the mix, some events for a Service do not overlap with any other event, so they have to be measured individually.

Any help at this point would be a bonus.

Thanks in advance.

Re: How to calculate duration of overlapping events from multiple Services

SplunkTrust

Depending on use case, there are a couple of ways to go. Basically, at a very high level, you create events for the start of each service that add 1 to the number of concurrent services, and for the end that subtract 1.
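
The +1/-1 idea can be sketched outside of Splunk. What follows is a minimal Python sketch, not SPL and not the code from the linked answers; the function name `overlap_windows` and the `(start, end, service)` tuple layout are assumptions made for illustration. It keeps a running count per service and closes a window whenever the count returns to zero, so an event that overlaps nothing comes out as its own window. The sample rows are a subset of the App2 and App3 rows from the question.

```python
from collections import defaultdict

def overlap_windows(events):
    """events: iterable of (start, end, service) in epoch seconds.
    Returns {service: [(window_start, window_end, duration), ...]}."""
    # Turn each event into two markers: +1 at its start, -1 at its end.
    markers = defaultdict(list)
    for start, end, service in events:
        markers[service].append((start, 1))
        markers[service].append((end, -1))

    windows = defaultdict(list)
    for service, points in markers.items():
        # Sort by time; at equal times handle starts (+1) before ends (-1)
        # so back-to-back events are treated as one continuous window.
        points.sort(key=lambda p: (p[0], -p[1]))
        active = 0
        window_start = None
        for time, delta in points:
            if active == 0 and delta == 1:
                window_start = time  # a new overlap window opens
            active += delta
            if active == 0:
                # the last concurrent event has ended, so close the window
                windows[service].append((window_start, time, time - window_start))
    return dict(windows)

events = [
    (1535471219, 1535474819, "App2"),
    (1535471269, 1535474869, "App2"),
    (1535471319, 1535474919, "App2"),
    (1535479849, 1535483449, "App2"),
    (1535481348, 1535484948, "App2"),
    (1535512218, 1535512288, "App2"),
    (1535481100, 1535481103, "App3"),
    (1535481100, 1535481107, "App3"),
]

for service, spans in overlap_windows(events).items():
    print(service, [duration for _, _, duration in spans])
# App2 [3700, 5099, 70]
# App3 [7]
```

In SPL, a running total of this kind is usually built with `streamstats`, split by the service field; the linked answers show the actual queries.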

Remember, when you get the underlying events, that you need to create a start for any service that ENDS in your time period. If there is not already a start, create one that has a time immediately before the time period.
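
As a rough illustration of that boundary case, here is a small Python sketch (again not SPL). The function name `add_missing_starts`, the use of `None` for a start that was not returned by the search, and the choice of one second as "immediately before" are all assumptions made for this example.

```python
def add_missing_starts(events, window_start):
    """events: list of (start_or_None, end, service); start is None when the
    start event fell before the search window and was not returned."""
    fixed = []
    for start, end, service in events:
        if start is None:
            # Assumption: "immediately before" is modelled as one second earlier.
            start = window_start - 1
        fixed.append((start, end, service))
    return fixed

window_start = 1535470000  # hypothetical earliest time of the search
sample = [(None, 1535471449, "App2"), (1535471219, 1535474819, "App2")]
print(add_missing_starts(sample, window_start))
# [(1535469999, 1535471449, 'App2'), (1535471219, 1535474819, 'App2')]
```

Without the synthetic start, an end marker would subtract 1 from a count that never received its matching +1.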

Here are a couple of fully documented examples.

https://answers.splunk.com/answers/513002/how-to-graph-sum-of-overlapping-values-given-start.html

https://answers.splunk.com/answers/577850/how-to-recordcalculate-the-duration-of-overlapping.html

Re: How to calculate duration of overlapping events from multiple Services

Path Finder

The second hyperlink was where the gold was and, as a side note, it was the answer to another post of mine. It's amazing how it all works together.

Ok, back to Splunking.

Thanks for the help.

Re: How to calculate duration of overlapping events from multiple Services

Path Finder

Hi DalKeanis,
I have the same requirement: calculating downtime duration within overlapping events. In the data below, downtime starts at 09:05:41.031 and ends at 09:14:16.802.
Basically, I want to skip an UP event if another Service went down before it.

Could you please help to build the query?

2019-03-27 09:05:41.031 Service1 DOWN
2019-03-27 09:05:43.783 Service2 DOWN
2019-03-27 09:06:13.332 Service3 DOWN
2019-03-27 09:07:32.118 Service1 UP
2019-03-27 09:07:34.742 Service1 DOWN
2019-03-27 09:07:40.743 Service2 UP
2019-03-27 09:07:41.594 Service1 UP
2019-03-27 09:07:45.288 Service1 DOWN
2019-03-27 09:07:51.441 Service1 UP
2019-03-27 09:08:33.786 Service1 DOWN
2019-03-27 09:09:22.265 Service1 UP
2019-03-27 09:14:10.797 Service4 DOWN
2019-03-27 09:15:39.382 Service3 UP
2019-03-27 09:14:16.802 Service4 UP
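
One possible reading of this requirement, sketched in Python rather than SPL: a downtime window opens when the first service goes DOWN and closes only when no service is left DOWN, so an UP event is ignored while any other service is still down. The function name `downtime_windows` and the sample rows are hypothetical and are not taken from the data above. The sketch sorts rows by timestamp first, which is also an assumption about how out-of-order rows should be treated.

```python
from datetime import datetime

def downtime_windows(rows):
    """rows: list of (timestamp_string, service, state), state is 'DOWN' or 'UP'.
    Returns [(window_start, window_end, seconds), ...]."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    parsed = sorted((datetime.strptime(ts, fmt), svc, state) for ts, svc, state in rows)
    down = set()          # services currently down
    window_start = None
    windows = []
    for when, svc, state in parsed:
        if state == "DOWN":
            if not down:
                window_start = when  # first service to go down opens a window
            down.add(svc)
        else:
            down.discard(svc)
            # close the window only once every service is back up
            if not down and window_start is not None:
                windows.append((window_start, when, (when - window_start).total_seconds()))
                window_start = None
    return windows

# Hypothetical sample rows, for illustration only.
rows = [
    ("2019-03-27 09:00:00.000", "A", "DOWN"),
    ("2019-03-27 09:00:10.000", "B", "DOWN"),
    ("2019-03-27 09:00:20.000", "A", "UP"),    # ignored: B is still down
    ("2019-03-27 09:00:45.500", "B", "UP"),    # closes the first window
    ("2019-03-27 09:05:00.000", "A", "DOWN"),
    ("2019-03-27 09:05:03.000", "A", "UP"),
]
print([seconds for _, _, seconds in downtime_windows(rows)])
# [45.5, 3.0]
```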
