Concurrent users using Splunk Web

shobbit
Explorer

Hi - fairly new to Splunk and have a specific report my customer wants to monitor/report on. They want to understand how many people are using Splunk over time. This will also allow us to size the Splunk Web deployment in the future. I have the S.o.S app installed, but none of the dashboards in S.o.S or the default Splunk activity dashboards are quite right for giving me user concurrency.

I am trying to construct something that shows the number of concurrent users that have/are logged into Splunk Web, i.e.

8am: 1 user

9am: 3 users

10am: 2 users

etc.


davidpaper
Contributor

You've hit on a touchy problem with Splunk: figuring out how busy the infrastructure is at any point in time. There are two things to look at.

1) How many users are currently using Splunk.

This is interesting, but only goes so far. Am I "currently using Splunk" if I have a static dashboard on my screen that has finished loading 10 minutes ago, and I'm either staring at it, or have my head turned talking to someone else? martin_mueller's search in his comment is spot on, and will help you answer this question. 1 hour may be too long of a time frame, as I have found 1m or 5m is more useful for determining how busy Splunk is.
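For question 1, a tighter window on martin_mueller's search can be more telling. A sketch, assuming the splunk_web_access sourcetype is being indexed on this search head:

index=_internal sourcetype=splunk_web_access user=* NOT user="-"
| timechart span=5m dc(user) AS active_users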

2) How many searches are currently being run.

This is a little harder, because searches come and go, sometimes fairly quickly. There are a couple of ways to see this info. First, concurrent searches by user: who's exercising Splunk the most?

index=_internal source=*metrics.log group="search_concurrency" NOT "system total"
| timechart span=1m sum(active_hist_searches) as concurrent_searches by user

Interesting patterns emerge per person/group and time of day.

Second, is this ad-hoc or scheduled? Too many concurrent scheduled searches can really bring Splunk to its knees. A lot of scheduled searches may be okay, if they are very short duration (like populating summary indexes or report acceleration).

`set_sos_index` sourcetype=ps 
 | multikv
 | `get_splunk_process_type`
 | search type="searches"
 | rex field=ARGS "_--user=(?<search_user>.*?)_--"
 | rex field=ARGS "--id=(?<sid>.*?)_--"
 | rex field=sid "remote_(?<search_head>[^_]*?)_"
 | eval is_remote=if(like(sid,"%remote%"),"remote","local")
 | eval is_scheduled=if(like(sid,"%scheduler_%"),"scheduled","ad-hoc")
 | eval is_realtime=if(like(sid,"%rt_%"),"real-time","historical")
 | eval is_subsearch=if(like(sid,"%subsearch_%"),"subsearch","generic")
 | eval search_type=is_remote.", ".is_scheduled.", ".is_realtime
 | timechart span=1m dc(sid) AS "Search count" by is_scheduled

Props go out to hexx (SoS guru) for these, and hopefully they (or something like it) will show up in SoS in the near future.

martin_mueller
SplunkTrust

As a different approach, you could run this:

| pivot internal_audit_logs searches sum(total_run_time) AS run_time SPLITROW _time PERIOD hour SORT 0 _time
| eval avg_cpus = run_time / 3600
| timechart span=1d max(avg_cpus) as max_cpus_per_hour

That will calculate the total seconds spent searching during each hour, convert that into the average number of searches running concurrently during that hour (for example, 7200 seconds of total run time within one hour averages out to two concurrent searches), and use the worst hour of each day for charting.


shobbit
Explorer

Ignore that last comment, I figured it out - it needs to run in the S.o.S app, not the default Search app 🙂


shobbit
Explorer

BTW David, is your large SoS search supposed to work as written? My instance doesn't seem to like the `set_sos_index` or `get_splunk_process_type` bits. PS: I've not worked with searches of this complexity yet, so excuse my ignorance.


davidpaper
Contributor

That's a Splunk docs typo, in my opinion: "1 active user per core (ideally) two". If a user isn't actively searching, then they shouldn't count! 🙂


shobbit
Explorer

Hi davidpaper,

You've hit the nail on the head. Unfortunately, however, I don't actually care about busyness - I have enough monitoring elsewhere to figure out busy and the cause thereof - just concurrency at this point. Sigh.

It's an interesting conundrum, because one of the sizing factors Splunk recommends is 1 user per core (ideally 2), hence concurrency would seem to be a useful measure in sizing...

Thanks for the pointers so far, everyone.
PS - As MuS rightly surmised, I am using LDAP - forgot to mention that bit!


martin_mueller
SplunkTrust

You might get away with a simple search, something like this:

index=_internal sourcetype=splunk_web_access user=* NOT user="-" | timechart span=1h dc(user)

MuS
SplunkTrust

Hi shobbit,

It depends on how Splunk handles user authentication.

If you're using LDAP based users and SSO for authentication, user logins are not handled by Splunk and therefore you will not find any of the SSO / LDAP user logins in the audit.log.

But you can use the REST endpoint /services/authentication/httpauth-tokens on your search head like this:

| rest /services/authentication/httpauth-tokens splunk_server=local  

and you will get a list of users which were or still are connected over SSO / LDAP.

Setting this up as a saved search with summary indexing will give you the ability to gather historical events as well.
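A sketch of such a saved search, scheduled every few minutes with summary indexing enabled (the userName field name is an assumption about what the endpoint returns - check your actual output from | rest first):

| rest /services/authentication/httpauth-tokens splunk_server=local
| stats dc(userName) AS concurrent_users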

If you're using Splunk internal user authentication, you will find the needed information inside Splunk's audit.log. You can search for it like this:

index=_audit action="login attempt" | ...
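One way to finish that search off for concurrency reporting - a sketch, assuming successful logins are recorded with info=succeeded in your audit events:

index=_audit action="login attempt" info=succeeded
| timechart span=1h dc(user) AS logged_in_users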

hope this helps...

cheers,
MuS
