Monitoring Splunk

How to troubleshoot why Splunk performance suddenly decreased and we now see error "The splunkd daemon cannot be reached by splunkweb"?

MollyDS
Explorer

Splunk has been running extremely slowly since this morning, around 10 AM ET. When I came in this morning it was running fine at normal speeds, but around 10 it slowed down significantly. Then around 2 PM I started receiving:

 The splunkd daemon cannot be reached by splunkweb. Check that there are no blocked network ports or that splunkd is still running.

when I tried to go into some of the system settings.

Splunk has run extremely well for the past week or so (since we started the trial), with no changes being made to it. The only thing I was doing on Splunk this morning was trying to figure out how to use drop-downs on the dashboards by following the Splunk documentation.

What could be the reasons as to why it's not running well?

sjohnson_splunk
Splunk Employee
Splunk Employee

If you are running on Linux, how many file handles are available to Splunk? Look at the beginning of splunkd.log for an event like this:

04-07-2016 12:48:03.894 -0400 INFO ulimit - Limit: open files: 10240 files [hard maximum: unlimited]

If your number is the default value (1024), you will need to increase it to something like 16000.
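For anyone following along on Linux, a quick sketch of how you might check the limit and what a persistent increase could look like (the `splunk` user name and the 16000 value are assumptions; adjust them for the account actually running splunkd):

```shell
# Show the open-file limit for the current shell (and any process it launches).
ulimit -n

# Inspect the limits of a running splunkd process directly (if one exists):
#   cat /proc/$(pgrep -o splunkd)/limits | grep 'open files'

# To raise the limit persistently, lines like these would go in
# /etc/security/limits.conf (requires root; takes effect on next login):
#   splunk soft nofile 16000
#   splunk hard nofile 16000
```

After changing limits.conf you'd need to restart Splunk from a fresh login session so the new limit is picked up, then confirm the new value in the ulimit line of splunkd.log.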

MollyDS
Explorer

I'm not running on Linux but Windows. Would Windows look similar to that? Because I did find this:

Maximum number of threads (approximate): 16361

I don't know how comparable threads are to file handles, though.


ryanoconnor
Builder

There's definitely a lot to check when it comes to performance but I'll start with a few basic questions since we don't know much about your environment:

Is this a single instance of Splunk or distributed?
What Apps do you have installed?
How much data per day are you ingesting?
How many users do you have using the system?

Also the wiki is pretty comprehensive for performance issues: https://wiki.splunk.com/Community:PerformanceTroubleshooting


MollyDS
Explorer
  1. We have one instance of splunk, with one forwarder pushing in logs from another server
  2. Apps Installed: Alert Manager, Calendar Heat Map, Slideshow, and Timeline
  3. We are ingesting about 200-300 MB per day, 100-200 on the weekends though
  4. There are 2 admin users, but only one account is currently being used

I've been using it the same way for about a week, except today I was messing around with drop-downs in the dashboard. I left it for about 30 minutes for a meeting, and when I came back it was having issues.

I'll try reading up on the Community link you gave me.

ryanoconnor
Builder

Alright, see if that helps. I understand it can be dense. Out of curiosity, what does your hardware look like? What is the CPU/memory?


jkat54
SplunkTrust
SplunkTrust

Sometimes this happens when you've scheduled too many searches, sometimes when you have very long-running scheduled searches, and sometimes when your dispatch directory fills up with search artifacts.

I recommend stopping Splunk and starting it again. If the behavior is good for a moment and then goes bad again, you most likely have lots of searches scheduled in the background. Restart and watch the activity / job monitor.


MollyDS
Explorer

Oh, sorry, I forgot to mention that I already tried that, and it didn't improve at all. I'll try it once more and see if it improves this time. Thanks, I'll also try cleaning out the searches a bit and see what scheduled searches I have.
