Creating stats count based on a sequence of events within a timeframe. For example, count the unique sessions, within a 6-hour timeframe, that resulted in 1- Failures without Success, 2- Success, or 3- Failures followed by Success:
SessionID Time Action
Abcd 12:03:11 Failure
Abcd 12:04:19 Failure
m 12:05:49 Failure
XXXXX 12:06:20 Failure
XXXXX 12:07:34 Failure
Abcd 12:10:11 Failure
Abcd 12:23:12 Success
ZZ 12:28:10 Success
XXXXX 12:31:00 Failure
Abcd 21:03:11 Success
m 22:03:11 Failure
m 22:03:12 Success
Produces:
Failure_no_success | Success | Failure_then_success
2 | 2 | 2
Where Failure_no_success is the three XXXXX and the first m sessions, Success is the ZZ session and the last Abcd session, and Failure_then_success is the four Abcd and the last two m sessions.
There are multiple inefficient ways to solve this, like combining many subsearches, outputing some of the data to a lookup table and reading it back, etc. But is there a "correct" and scalable way to perform this count?
... View more