I have about 2658 devices checking into our deployment server (CentOS 6.6, x64, Splunk 6.41).
Overall we sit around 10-20% CPU with plenty of memory free, but UI performance is becoming basically unusable. I am guessing there are some performance tweaks I need to make, but I really haven't seen any guides on this.
Probably worth mentioning that 99% of the clients have a 2 hour check-in time, but about 20 servers (other Splunk servers) are set to check in every 2 minutes.
top - 19:09:45 up 319 days, 1:44, 2 users, load average: 0.72, 0.63, 0.40
Tasks: 235 total, 2 running, 233 sleeping, 0 stopped, 0 zombie
Cpu(s): 20.9%us, 7.8%sy, 0.0%ni, 69.0%id, 1.7%wa, 0.0%hi, 0.4%si, 0.0%st
Mem: 16333660k total, 15423556k used, 910104k free, 191868k buffers
Swap: 8388604k total, 375048k used, 8013556k free, 8733040k cached
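As a rough sanity check on the check-in intervals mentioned above, the aggregate phone-home rate can be estimated with simple arithmetic. This is only a back-of-the-envelope sketch: it assumes exactly 20 clients at the 2 minute interval and the remainder at 2 hours, and the function name is made up for illustration.

```python
def checkins_per_second(groups):
    """Sum the average check-in rate over (client_count, interval_seconds) groups."""
    return sum(count / interval for count, interval in groups)

# Assumption: 20 Splunk servers at 2 minutes, everything else at 2 hours.
total_clients = 2658
fast_clients = 20
groups = [
    (total_clients - fast_clients, 2 * 60 * 60),  # 2 hour interval
    (fast_clients, 2 * 60),                       # 2 minute interval
]

rate = checkins_per_second(groups)
print(f"approx. {rate:.2f} check-ins per second")
```

Under these assumptions the server handles roughly one check-in every two seconds on average, which fits the low CPU figures in the top output and suggests the check-in load is not what makes the UI slow.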
I ended up creating a "read only" replacement of the Forwarder Management interface using REST calls, as this responds in a reasonable time (around 30 seconds). To speed it up further I created a scheduled search, running every 5 minutes, that writes the results to a lookup, which is then used for the queries.
I have been progressing this issue with Support and received a glimmer of hope today:
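For anyone wanting to try something similar, here is a minimal sketch of pulling the client list over REST from outside Splunk. It is not the exact approach described above: it assumes the `deployment/server/clients` endpoint on the management port with basic authentication, and the content field names used (`hostname`, `ip`, `lastPhoneHomeTime`) as well as the function names are assumptions that should be checked against your own version's output.

```python
import base64
import json
import urllib.request


def build_clients_request(base_url, username, password):
    """Build a GET request for the deployment client list as JSON (count=0 means no paging limit)."""
    url = f"{base_url}/services/deployment/server/clients?output_mode=json&count=0"
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def summarize_clients(payload):
    """Reduce the REST response to one small dict per client.

    The content field names below are assumptions; missing fields become None.
    """
    rows = []
    for entry in payload.get("entry", []):
        content = entry.get("content", {})
        rows.append({
            "client": entry.get("name"),
            "hostname": content.get("hostname"),
            "ip": content.get("ip"),
            "last_phone_home": content.get("lastPhoneHomeTime"),
        })
    return rows


def fetch_clients(base_url, username, password, context=None):
    """Call the endpoint and return the summarized rows.

    Pass an ssl.SSLContext as `context` if the server uses a private CA.
    """
    request = build_clients_request(base_url, username, password)
    with urllib.request.urlopen(request, context=context) as response:
        return summarize_clients(json.load(response))
```

A call might look like `fetch_clients("https://deploy.example.com:8089", "admin", "changeme")`, where the host and credentials are placeholders. Caching the result (as the scheduled search and lookup do above) avoids paying the 30 second query cost on every page view.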
They've located the relevant codes which contributed to the issue and currently
discussing on possible fix since some of the change will impact both front end and backend.
Here's hoping a fix is on the way!
Just received confirmation from Splunk support that a partial fix for the Deployment Server UI performance will be made available in 7.1.2 or 7.0.5 (next release). From the testing performed this should decrease the loading time by about 50% on very large instances.
In one environment I work in, however, we are sitting at about a 6 minute load time. If this drops to 3 minutes it's a massive improvement, but 3 minutes is still unacceptable in my book.
I've followed this up by having an Enhancement Request (ER) logged, as suggested by Support, since further fixes will apparently require a more in-depth code review/change.
So hopefully in the near future things may be "better". I'll report our findings once it gets released.
I have a deployment server with triple the number of deployment clients. Same symptoms: the server is bored (heavily underutilized) and the GUI is unusable (huge delays after each click).
I observed that the browser I use on my laptop for administering the deployment server web GUI consumes all the RAM on my laptop.
I have never before seen a browser process consume more than 2 GB of RAM. The effect is independent of the browser used.
Maybe Splunk could help us by reducing the resource consumption in the browser (which would imply a redesign of the GUI).
What browser are you using? Using Chrome and 6.4+, I've been in engagements with 4000+ clients where the GUI was responsive and quite smooth. In what particular area(s) are you seeing slowness, or is it "all" areas? I'd say there could be some slowness in the forwarder listing, but this shouldn't cascade across the whole deployment...