after the upgrade to Splunk 6 from Splunk 4.3.3, we have serious problems with our single-server instance (Windows Server 2008 R2). In fact during the night the CPU usage of "splunkd" processes reaches 100% and hangs at this level, making the server unreachable via WEB interface, but often also via RDP connections. The only solution is to terminate the splunkd process and then restart it.
Having a look at splunkd.log, we see a continuous flooding of these errors (over 35.000 errors per second):
ERROR TcpChannel - Error trying to begin socket accept: An invalid argument was supplied.
These errors disappear after splunkd service has been restarted.
We guess there is some problem with inbound connections from Universal Forwarders (that have not been upgraded yet), but we have no clue to confirm this diagnosis. System event logs do not report any warning or error.
Any suggestions would be really appreciated.
I have the same problem. I am using splunk on a 12 cpu and 8 Go RAM system. when i run top command i notice 233 % of CPU use by splunkd.
Please i need more information about this process.
How splunkd works in a multi processor system ?
Unluckily I have now way to identify the error that occurred immediately before the TcpChannel messages, because these are generated too fast and the splunkd.log files are flooded with them. The only error we can see in the splunkd.log and its associated rotation files is the TcpChannell message, repeated indefinitely.