We have a distributed environment with a total of two indexers.
These run on 12-core machines.
After upgrading to Splunk 4.3.3 (build 128297) the indexers have started acting rather strangely.
After running for a while searches never finish.
When I check the indexers, the splunkd process seems to be using only one CPU core, which (naturally) is pegged at 100%.
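For anyone seeing the same symptom, here is a minimal sketch (assuming a Linux indexer with Python on it; the function name is my own) of how to confirm that a single splunkd thread is the one burning the CPU, by reading per-thread tick counts from /proc:

```python
import os

def thread_cpu_ticks(pid):
    """Return {tid: utime+stime clock ticks} for each thread of pid,
    read from /proc/<pid>/task/<tid>/stat (Linux only)."""
    ticks = {}
    for tid in os.listdir(f"/proc/{pid}/task"):
        with open(f"/proc/{pid}/task/{tid}/stat") as f:
            # Split after the "(comm)" field so spaces in the process
            # name cannot shift the field indexes.
            rest = f.read().rsplit(") ", 1)[1].split()
        # rest[11] is utime, rest[12] is stime (fields 14/15 of stat(5))
        ticks[int(tid)] = int(rest[11]) + int(rest[12])
    return ticks
```

Calling it twice a few seconds apart on the splunkd pid and diffing the per-tid counts should show one thread accounting for nearly all the growth if only one core is really being used.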
Normally a restart of the Splunk processes will clear the problem for a while (anywhere from a few hours to a couple of days).
So far I have not found anything that seems relevant in the logs.
Does anyone have any suggestions?
Saw the same on our Splunk server (RHEL 6).
Fixed it via a hint in this article:
Steps 3 and 4 can also be executed with the script mentioned in the article.
Tried the suggestion above, and it seemed to work for a couple of days.
But now the indexers are back in the same state.
They seem to be using only one core, and forwarder connections are being dropped.
Which of course means that data is not being indexed.
I really have no idea what to search for in the Splunk logs; all the usual gotchas seem to be OK.
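Not an answer, but as a starting point for the log hunt: a small sketch that tallies WARN/ERROR lines per component in splunkd.log so the noisiest subsystems stand out. The line layout below (timestamp, timezone, level, component) matches what I see in our 4.x splunkd.log, but treat it as an assumption and adjust for your install:

```python
from collections import Counter

def noisy_components(path, levels=("WARN", "ERROR", "FATAL")):
    """Tally (level, component) pairs from a splunkd.log-style file,
    assuming lines look like:
      MM-DD-YYYY HH:MM:SS.mmm +ZZZZ LEVEL Component - message
    Returns the ten most frequent pairs."""
    counts = Counter()
    with open(path, errors="replace") as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 5 and parts[3] in levels:
                counts[(parts[3], parts[4])] += 1
    return counts.most_common(10)
```

Pointing it at $SPLUNK_HOME/var/log/splunk/splunkd.log right after the indexer wedges, and comparing against a healthy period, might surface which component starts complaining first.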