Hi everyone, I have the following event format:
_time logindate logoutdate userid
2019-07-25 09:41:21 25/07/2019 09:41:21 25/07/2019 13:47:52 USER1
2019-07-25 09:41:02 25/07/2019 09:41:02 25/07/2019 11:43:17 USER2
2019-07-25 09:39:56 25/07/2019 09:39:56 25/07/2019 13:01:17 USER4
2019-07-25 09:39:45 25/07/2019 09:39:45 25/07/2019 11:39:58 USER3
2019-07-25 09:39:15 25/07/2019 09:39:15 25/07/2019 10:32:34 USER2
2019-07-25 09:38:04 25/07/2019 09:38:04 25/07/2019 11:39:07 USER1
logindate and _time have the same value, because splunk considered the logindate field as the event _time automatically.
What I need to accomplish is to count distinctively the number of users that were logged in at the same time. I have studied the concurrency command, but I don't think it solves my problem since I need to count distinct users.
I was able to solve this in SQL (where these values are acctualy stored) by creating an auxliary table with just timestamps ranging from the earliest logindate to the latest logout date, incremented by the hour and then inner joining that table to data table whenever the timestamp from the dummy table was between logindate and logoutdate.
Could I accomplish something similar or better in Splunk?
I was able to solve this by creating a range of dates between logindate
and logoutdate
and turning that field into the new _time
field:
mysearch
| eval logindate = strptime(logindate, "%d/%m/%Y %H:%M:%S") | bin span=1m logindate
| eval logoutdate = strptime(logoutdate, "%d/%m/%Y %H:%M:%S") | bin span=1m logoutdate
| eval range = mvrange(logindate, logoutdate, "1m")
| field - _time
| rename range as _time
| timechart span=5m limit=10 otherstr=Outros distinct_count(userid)
First I had to convert the fields logindate
and logoutdate
to proper datetime, and then I used the bin
command just to snap logindate
and logoutdate
back to their earliest minute, but that's optional.
I used mvrange
to create a multivalued field of dates wich range between logindate
and logoutdate
every 1 minute. Then I got rid of the original _time
field and replaced it with the range
field.
After that, a simple timechart
will do the trick.
I was able to solve this by creating a range of dates between logindate
and logoutdate
and turning that field into the new _time
field:
mysearch
| eval logindate = strptime(logindate, "%d/%m/%Y %H:%M:%S") | bin span=1m logindate
| eval logoutdate = strptime(logoutdate, "%d/%m/%Y %H:%M:%S") | bin span=1m logoutdate
| eval range = mvrange(logindate, logoutdate, "1m")
| field - _time
| rename range as _time
| timechart span=5m limit=10 otherstr=Outros distinct_count(userid)
First I had to convert the fields logindate
and logoutdate
to proper datetime, and then I used the bin
command just to snap logindate
and logoutdate
back to their earliest minute, but that's optional.
I used mvrange
to create a multivalued field of dates wich range between logindate
and logoutdate
every 1 minute. Then I got rid of the original _time
field and replaced it with the range
field.
After that, a simple timechart
will do the trick.
This absolutely is a use-case for the concurrency
command; see my answer here for everything that you need:
https://answers.splunk.com/answers/319585/how-to-graph-the-number-of-active-sessions-over-ti.html
I was looking at the answer you posted, and noticed a main difference: a already have the logindate
and logoutdate
fields in every event, so could I replace everything from the beginning up until before the table
command with just eval duration = logoutdate - logindate
?
Also, how does appending the gentimes
result to the table logindate, duration
garantees that I will be able to count distinct users?
This log is from an Application which is licensed by Concurrent Users and the same user logged in several times during the same period does not count as multiple concurrent user.
Yes, you are correct about duration
. The gentimes
sets the timeframe inside of which to count concurency
(e.g. hourly concurrent users). However you are correct; there is nothing that would allow you to count distinctly
so this is not a complete answer.
Have you looked at streamstats?