Getting Data In

How to get the duration between different events of the same asset?

Randall2020
Engager

Hi, I would like to find out how to calculate the time difference between different events of the same asset ID (group by). My data is structured as such (see below) with no transaction IDs provided:

            Asset_ID          _time                                                  Event_Status

  1. A001                  2021-01-01 00:00:00                 A
  2. A001                  2021-01-01 00:01:00                 B
  3. A001                  2021-01-01 00:07:00                 A
  4. A002                  2021-01-01 00:01:00                 B
  5. A002                  2021-01-01 00:02:00                 C
  6. A002                  2021-01-01 00:09:00                 A
  7. A002                  2021-01-01 00:11:00                 D
  8. A003                  2021-01-01 00:00:00                 B
  9. A003                  2021-01-01 00:09:00                 D
  10. ...

Note:

  1. the event statuses can appear in any order
  2. time duration needs to be grouped by common asset ID for it to be meaningful
  3. It's intended that this be deployed in a live system, so event status values that aren't closed (last value for an asset) are to display the current elapsed time since that event occurred

The desired output would be to compute the time duration between rows as such:

  1. 1 & 2 for Asset A001: 1min
  2. 2 & 3 for Asset A001: 6min
  3. 3 for Asset A001: running for X duration (based on current time) since 2021-01-01 00:07:00
  4. 4 & 5 for Asset A002: 1min
  5. 5 & 6 for Asset A002: 7min
  6. 6 & 7 for Asset A002: 2min
  7. 7 for Asset A002: running for X duration (based on current time) since 2021-01-01 00:11:00
  8. 8 & 9 for Asset A003: 9min
  9. 9 for Asset A003: running for X duration (based on current time) since 2021-01-01 00:09:00
  10. ...

I've previously tried experimenting using the "transaction" and "duration" functions but they don't seem to give the desired result. Any suggestions on how to resolve this would be greatly appreciated. Thanks.

Labels (3)
Tags (2)
0 Karma
1 Solution

bowesmana
Champion

One solution is like this

| makeresults
| eval _raw="Asset_ID _time Event_Status
A001    2021-01-01:00:00:00 A
A001    2021-01-01:00:01:00 B
A001    2021-01-01:00:07:00 A
A002    2021-01-01:00:01:00 B
A002    2021-01-01:00:02:00 C
A002    2021-01-01:00:09:00 A
A002    2021-01-01:00:11:00 D
A003    2021-01-01:00:00:00 B
A003    2021-01-01:00:09:00 D"
| multikv forceheader=1
| eval _time=strptime(time, "%F:%T")
| table Asset_ID _time Event_Status
| sort - _time
| streamstats window=2 global=false range(_time) as duration by Asset_ID
| eval duration=if(duration=0, now()-_time, duration)

From the sort command. Using streamstats to calculate the time between this and previous event and then if duration is 0 (because it's the last event for an Asset_ID) it just calculates current time - event time.

Note that this requires the event order to be in latest_first order as the streamstats calculates the duration for the current event as this event time - earlier event time.

If your data is already in that order you will not need to sort - typically it will be.

The important global=false means that there are separate ranges calculated using the by clause, so you do not need to ensure the events are coming in for the same Asset_ID order.

Note that also if a duration is 0 between any two events, then it assumes it's the last event, so will just calculate now() - _time as the duration. If your events can really have a duration of 0, then you will need to find a solution for that.

 

View solution in original post

Randall2020
Engager

Thanks! That was really helpful.

0 Karma

bowesmana
Champion

One solution is like this

| makeresults
| eval _raw="Asset_ID _time Event_Status
A001    2021-01-01:00:00:00 A
A001    2021-01-01:00:01:00 B
A001    2021-01-01:00:07:00 A
A002    2021-01-01:00:01:00 B
A002    2021-01-01:00:02:00 C
A002    2021-01-01:00:09:00 A
A002    2021-01-01:00:11:00 D
A003    2021-01-01:00:00:00 B
A003    2021-01-01:00:09:00 D"
| multikv forceheader=1
| eval _time=strptime(time, "%F:%T")
| table Asset_ID _time Event_Status
| sort - _time
| streamstats window=2 global=false range(_time) as duration by Asset_ID
| eval duration=if(duration=0, now()-_time, duration)

From the sort command. Using streamstats to calculate the time between this and previous event and then if duration is 0 (because it's the last event for an Asset_ID) it just calculates current time - event time.

Note that this requires the event order to be in latest_first order as the streamstats calculates the duration for the current event as this event time - earlier event time.

If your data is already in that order you will not need to sort - typically it will be.

The important global=false means that there are separate ranges calculated using the by clause, so you do not need to ensure the events are coming in for the same Asset_ID order.

Note that also if a duration is 0 between any two events, then it assumes it's the last event, so will just calculate now() - _time as the duration. If your events can really have a duration of 0, then you will need to find a solution for that.

 

View solution in original post

Register for .conf21 Now! Go Vegas or Go Virtual!

How will you .conf21? You decide! Go in-person in Las Vegas, 10/18-10/21, or go online with .conf21 Virtual, 10/19-10/20.