Splunk Search

Set implied transaction start and end time if start or end events are missing

mahesh_ravji1
Explorer

Hi There,

We have some user activity logs with LOG_ON and LOG_OFF events in Splunk similar to following:

2014/07/13 13:00:00 User=UserA SessionId=cfe66c82-e983-4d4c-b654-84555b94c7c8 Event=LOG_OFF

2014/07/13 14:30:00 User=UserA SessionId=00ba16c5-928d-4845-88fd-dceb6f1eae8c Event=LOG_ON

2014/07/13 15:00:00 User=UserA SessionId=00ba16c5-928d-4845-88fd-dceb6f1eae8c Event=LOG_OFF

2014/07/13 20:00:00 User=UserA SessionId=378f9fb2-c21d-4ee5-b6c1-fa54d46eff3e Event=LOG_ON

I would like to calculate the total logged on time per day based on above data (i.e. time between LOG_ON and LOG_OFF events). The transaction command appears to be perfect for this requirement.

search | transaction SessionId startswith=(Event="LOG_ON") endswith=(Event="LOG_OFF") | table _time Event User SessionId duration

_time           Event   User        SessionId               duration

2014-07-13 13:00:00 LOG_OFF User=UserA  cfe66c82-e983-4d4c-b654-84555b94c7c8    0

2014-07-13 14:30:00 LOG_OFF User=UserA  00ba16c5-928d-4845-88fd-dceb6f1eae8c    1800.00

            LOG_ON
2014-07-13 20:00:00 LOG_ON  User=UserA  378f9fb2-c21d-4ee5-b6c1-fa54d46eff3e    0

However if the first event of a transaction is a LOG_OFF, I would like to assume start of transaction LOG_ON was at midnight 2014-07-13 00:00:00. Equally if the last event of a transaction is a LOG_ON, I would like to assume end of transaction LOG_OFF was at next midnight 2014-07-14 00:00:00.
So for the above data the total logged on time should be 13 hours + 30 mins + 4 hours = 17.5 hours.

Any help on how this can be achieved is most appreciated.

Thanks

Tags (1)

emiller42
Motivator

Put the following after your transaction:

| eval duration=if(eventcount==1, case(Event=="LOG_OFF", _time-relative_time(_time, "@d"), Event=="LOG_ON", relative_time(_time, "+1d@d")-_time), duration)

This does the following:

  • Checks if eventcount==1. If not, it uses the existing duration value.
  • If Event=="LOG_OFF" then it calculates the difference between the timestamp and the start of the day.
  • If Event=="LOG_ON" then it calculates the difference between the timestamp and the start of the next day.

That will allow transactions with a proper start/end to keep their transaction generated durations, while incomplete transactions will show durations according to your conditions. Then you should be able to do stats as expected.

emiller42
Motivator

Glad I could help! Can you mark the answer as accepted? (Click the checkmark) That way anyone finding this in search later knows it has a working solution.

Thanks!

0 Karma

mahesh_ravji1
Explorer

Thanks your answer solved my problem. I ended up using a slightly modified query as follows:

| eval duration=case(duration>0, duration, Event=="LOG_OFF", _time-relative_time(_time, "@d"), Event=="LOG_ON", relative_time(_time, "+1d@d")-_time)

0 Karma

datasearchninja
Communicator

How about this:

| eval startofday=strptime(strftime(_time, "%Y/%m/%d 00:00:00"), "%Y/%m/%d %H:%M:%S") | eval endofday=strptime(strftime(_time, "%Y/%m/%d 23:59:59"), "%Y/%m/%d %H:%M:%S") | eval new_duration=case(duration>0,duration,Event="LOG_ON",endofday-_time,Event="LOG_OFF",_time-startofday)

mahesh_ravji1
Explorer

Thanks your answer also helped. I combined your answer with the other answer and came up with:
| eval duration=case(duration>0, duration, Event=="LOG_OFF", _time-relative_time(_time, "@d"), Event=="LOG_ON", relative_time(_time, "+1d@d")-_time)

0 Karma