I have several events that are structured like this:
2020-09-28T15:18:40Z duration=8.0 somevalue=42 otherfield=A
2020-09-28T15:18:45Z duration=2.0 somevalue=10 otherfield=B
2020-09-28T15:18:44Z duration=2.0 somevalue=10 otherfield=B
Here "duration" is in seconds.
I would like to transform those events to a kind of timechart by spreading the "somevalue" on the "duration" starting from the "_time" of the event. The span would be 1second for example.
Another condition, is to be able to aggregate on "otherfield".
I'm expecting something like this:
_time | sum(somevalue) otherfield=A | sum(somevalue) otherfield=B |
2020-09-28T15:18:40Z | 42 | 0 |
2020-09-28T15:18:41Z | 42 | 0 |
2020-09-28T15:18:42Z | 42 | 0 |
2020-09-28T15:18:43Z | 42 | 0 |
2020-09-28T15:18:44Z | 42 | 10 |
2020-09-28T15:18:45Z | 42 | 20 |
2020-09-28T15:18:46Z | 42 | 10 |
2020-09-28T15:18:47Z | 42 | 0 |
2020-09-28T15:18:48Z | 42 | 0 |
2020-09-28T15:18:49Z | 0 | 0 |
I tried to use the "concurrency" function but was not able to get the values spread on several intervals (only a value for the _time of the event).
I came up with a solution to a similar problem here
In essence, what I did was duplicate each event for each time unit the original event was active (days in the original question, but could easily be seconds in your case). This was done using the makecontinuous command, having first set up a field with the running total of events needed. Then the _time field is adjusted for each new event. This gives a basis on which to do stats.
I noticed that for fieldA in your example you have a duration of 8 seconds but 9 events, whereas for the two fieldB examples you have duration of 2 seconds and 2 events each. Which model do you want to use?
It is a complicated approach, if you have any questions, or need help setting this up for your example, let me know (although I am quite busy so may be a little while to work it through).
Thank you for the proposition @ITWhisperer
The 9 repeated events was a mistake, it should be 8.
I'm really confused about the goal of "makecontinuous" role and how to use it. I would appreciate if you can give me an example of the query that you think will work.
From your data
2020-09-28T15:18:40Z duration=8.0 somevalue=42 otherfield=A
2020-09-28T15:18:45Z duration=2.0 somevalue=10 otherfield=B
2020-09-28T15:18:44Z duration=2.0 somevalue=10 otherfield=B
We would add a column for the number of rows needed
Time | Duration | somevalue | otherfield | rows |
2020-09-28T15:18:40Z | 8.0 | 42 | A | 8 |
2020-09-28T15:18:45Z | 2.0 | 10 | B | 10 |
2020-09-28T15:18:44Z | 2.0 | 10 | B | 12 |
Then we put a row at the start
Time | Duration | somevalue | otherfield | rows |
1 | ||||
2020-09-28T15:18:40Z | 8.0 | 42 | A | 8 |
2020-09-28T15:18:45Z | 2.0 | 10 | B | 10 |
2020-09-28T15:18:44Z | 2.0 | 10 | B | 12 |
makecontinuous when applied to field rows expands this to
Time | Duration | somevalue | otherfield | rows |
1 | ||||
2 | ||||
3 | ||||
4 | ||||
5 | ||||
6 | ||||
7 | ||||
2020-09-28T15:18:40Z | 8.0 | 42 | A | 8 |
9 | ||||
2020-09-28T15:18:45Z | 2.0 | 10 | B | 10 |
11 | ||||
2020-09-28T15:18:44Z | 2.0 | 10 | B | 12 |
We can then copy the relevant data up (well down after we sort descending on rows)
We also adjust the time field so that in increases over the relevant duration
Time | Duration | somevalue | otherfield | rows |
2020-09-28T15:18:40Z | 8.0 | 42 | A | 1 |
2020-09-28T15:18:41Z | 8.0 | 42 | A | 2 |
2020-09-28T15:18:42Z | 8.0 | 42 | A | 3 |
2020-09-28T15:18:43Z | 8.0 | 42 | A | 4 |
2020-09-28T15:18:44Z | 8.0 | 42 | A | 5 |
2020-09-28T15:18:45Z | 8.0 | 42 | A | 6 |
2020-09-28T15:18:46Z | 8.0 | 42 | A | 7 |
2020-09-28T15:18:47Z | 8.0 | 42 | A | 8 |
2020-09-28T15:18:45Z | 2.0 | 10 | B | 9 |
2020-09-28T15:18:46Z | 2.0 | 10 | B | 10 |
2020-09-28T15:18:44Z | 2.0 | 10 | B | 11 |
2020-09-28T15:18:45Z | 2.0 | 10 | B | 12 |
You now have the data you need to count by time, somevalue, otherfield