Splunk Search

How to graph a unique count of users logged on by hour from login and log out information

Splunkster45
Communicator

I have a set of log entries that looks like the following:

2014/10/20 12:23:30 [28761-9098]: Session 9098 (username@ipaddress) started
2014/10/20 14:33:33 [28761-9098]: Session 9098 ended

I would like to be able to create a graph that shows how many people are logged in for a particular hour. Currently, I'm able to create a graph that shows the hour in which people have logged on. In the above time example, my current graph would show that the user logged on during the 12:00 hour, but would not show anything for the 13:00 or 14:00 hour.

Here's what my code looks like:

... "Login succeeded for user" | rex field=_raw ".*Login succeeded for user: (?<user>.*)"    | stats dc(user) as "unique logins" by date_hour

I'm familiar with the transaction command and am able to get the duration for which people are logged on with the code below, but I don't know how to apply that to a graph showing a unique count of people for each hour.

... "Session" | rex field=_raw "Session  (?<number>>>doublebackslash>w+) (<doublebackslash>((?<user><doublebackslash>w+)@|)" | transaction number startswith "started" endswith "ended" | where duration > 1 |

Does anyone have any advice? Thanks in advance!


Splunkster45
Communicator

I've looked into this further and found a thread that accomplishes exactly what I'm trying to do! link

I have two questions about this. First of all, I can't seem to get past the regex part. Here's my version of the code:

  "Session" | rex field=_raw "Session (?<id>\\w+) (\\((?<user>\\w+)@|)" | eval mytime=_time | transaction id startswith "started" endswith "ended" | where duration > 1
| eval transactionid=id._time 
| stats min(mytime) AS start max(mytime) AS stop values(id) AS id values(duration) AS duration by transactionid
| eval mytimeconcat="1_".start." -1_".stop
| eval mytimemv=split(mytimeconcat," ") 
| mvexpand mytimemv  
| rex field=mytimemv "(?(1|-1))_(?<_time>d+)" 

The error message that I get when I run this is as follows: "Error in 'rex' command: Encountered the following error while compiling the regex '(?(1|-1))_(?<_time>d+)': Regex: malformed number or name after (?("

I think I'm getting this error message because I already have a rex command in the search, but I'm not sure. Secondly, in the command | table _time id counter, where does counter come from? It seems to appear out of nowhere.

Thanks!


emiller42
Motivator

Your second rex looks off. Can you explain what it's trying to do?

"(?(1|-1))_(?<_time>d+)" 

That first ? makes me think you're trying to extract either 1 or -1 as a field, but you're not giving the capture group a name. Without a name, the (?( sequence opens a regex conditional group, and 1|-1 isn't a valid group number or name, which is exactly why the compiler complains about a "malformed number or name after (?(".

If that's your intent, the regex should be:

"(?<field_name>1|(-1))_(?<_time>d+)"

emiller42
Motivator

The concurrency command can be helpful here. Given a start time and a duration field (start defaults to _time; the duration field is named via duration=), it calculates overlaps. Assuming users can't have more than one session active at a time, this would give you concurrent users over time:

... | stats count earliest(_time) as _time latest(_time) as logoff first(user) as user by number
    | eval duration=logoff-_time
    | eval duration=if(count==1, now()-_time, duration)
    | concurrency duration=duration output=concurrent_users
    | timechart avg(concurrent_users)

The above does the following:

  • In this case, we can use stats as a cheap transaction, assuming the session number is unique for each login session.
  • We calculate the session duration from the earliest/latest times.
  • To accommodate users who are currently logged in, we redefine duration as now()-_time if there is only one event (a login but no logoff).
  • We then use concurrency to generate a concurrent-sessions metric.
  • Finally, we timechart the concurrent_users field created in the previous step. (A peak-oriented variation is sketched below.)
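
If hourly peaks are more interesting than averages, a small variation (my sketch, not part of the original answer) replaces the final timechart:

... | concurrency duration=duration output=concurrent_users
    | timechart span=1h max(concurrent_users) as peak_concurrent_users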

Splunkster45
Communicator

The session number is unique for each login session.
I implemented this (note that concurrency needs the explicit duration=duration argument, as shown above) and got a cool graph.

I thought this was really nifty. However, it turned out to be a graph of the total number of users that have ever logged in: when I displayed it for the past week, I got back a (strictly) increasing function, so it doesn't appear the graph took into account when people logged off. I'm looking for the number of concurrent users per hour; if there were some way to subtract users when they log off, I think this would be close. Also, people can and will have multiple open sessions at once, so if one person is logged in twice at 10:00, I need to count them only once. Still, I've been learning to use concurrency, so I've been able to take some things away from this 🙂


Splunkster45
Communicator

I think one reason this may be happening is that _time is in mm/dd/yyyy format while logoff is in epoch time, which would break the duration calculation that subtracts the two.


emiller42
Motivator

That shouldn't be the case; _time is actually stored in epoch time. You can test this with the following:

index=* | head 1 | eval test=1414675868 | eval now=now() | eval diff=now-test | table now test diff | fieldformat now_frmt=strftime(now, "%c") | fieldformat test_frmt=strftime(test, "%c")

If you subtract two epoch times in Splunk, the difference is in seconds, which is the scale that concurrency expects.

For me the output of that is:
now: 1414676209
test: 1414675868
diff: 341
now_frmt: Thu Oct 30 08:36:49 2014
test_frmt: Thu Oct 30 08:31:08 2014


emiller42
Motivator

Concurrency should take care of this. It takes the start time and the duration for each event, and then calculates overlaps. This means that as sessions end, they drop out of the concurrency total. The only thing I can think of is that the duration is being calculated at a different scale than the concurrency command expects (i.e., duration is in milliseconds but concurrency assumes seconds). That would make sessions seem to go on longer than expected, so the graph appears to just keep growing over time.
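
If you want to rule that out, a quick sanity check (my sketch, reusing the field names from the stats pipeline above) is to eyeball the range of computed durations; session lengths in seconds should be in the hundreds or thousands, not millions:

... | stats count earliest(_time) as _time latest(_time) as logoff by number
    | eval duration=logoff-_time
    | stats min(duration) max(duration) avg(duration)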


gkanapathy
Splunk Employee

Actually, you don't need to pair sessions into transactions if you just want the net total in a given time period:

... | rex "Session (?<s_sid>\d+) \(\@\) started" | rex "Session (?<e_sid>\d+) ended"
    | timechart span=1h count(s_sid) AS logins count(e_sid) AS logouts
    | eval net_logins=logins-logouts
    | streamstats global=t current=t sum(net_logins) as cumulative_net_users
    | timechart span=1h sum(cumulative_net_users)

Splunkster45
Communicator

I've got a workaround for the first issue that I just listed: sessions that ended but did not start during the specified time frame. If I make clever use of the time frame and go back to a point where no one is on (e.g. snap to midnight of a particular day), then I won't have this issue. However, our program also runs overnight, so this won't take care of all instances, but it's a start.
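
A minimal sketch of that idea, assuming an inline time range and borrowing the field names from gkanapathy's answer above:

earliest=-7d@d "Session"
    | rex "Session (?<s_sid>\d+) \([^)]*@[^)]*\) started" | rex "Session (?<e_sid>\d+) ended"
    | timechart span=1h count(s_sid) AS logins count(e_sid) AS logouts
    | eval net_logins=logins-logouts
    | streamstats sum(net_logins) as cumulative_net_users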


Splunkster45
Communicator

This was helpful! But there are a few things that I need to iron out.

I modified my opening post to show exactly what the logs look like. Before, I think I put a wildcard (or something similar) in place of the username and ipaddress, since it wasn't important, but that messed things up. One issue I'm having with this: if I go back 48 hours, then I get sessions that ended but never started (their start was more than 48 hours ago). If there are three sessions that fit this scenario, then my graph is down by 3.

I also have to figure out how to make the logins unique. If one person is logged in twice in an hour, I don't want to count them twice. I can sort of map this out in my head (with a bash script), and it doesn't look like it will be simple. I wouldn't mind being wrong, though.
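
For what it's worth, one way to sketch this in SPL (untested against your data; it reuses the number/user extraction from the question): expand each session across every hour bucket it touches with mvrange and mvexpand, then count distinct users per hour, so overlapping sessions from the same user count once:

... "Session"
    | rex field=_raw "Session (?<number>\w+) (\((?<user>\w+)@|)"
    | stats earliest(_time) as start latest(_time) as stop first(user) as user by number
    | eval hours=mvrange(floor(start/3600)*3600, stop+1, 3600)
    | mvexpand hours
    | eval _time=hours
    | timechart span=1h dc(user) as unique_users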
