A subset of our remote event log collections has stopped spontaneously.
We monitor 70 remote event logs in total, but approximately 25% of them have stopped.
Most of these stopped on the same day, and some stopped a couple of days later. They all use the same index.
I disabled collection for one device and then re-enabled it, which restored logging on that device.
As soon as I track down who ordered Splunk I am planning to open a support call, but while I sort out the support login I wanted to check whether this is a known issue or whether I am doing something wrong.
Hmm, checking the Splunk log, it seems that log collection times out during maintenance windows/downtime, and once it has timed out it doesn't attempt to collect logs from that server again until I stop and start collection. Is that expected behavior? It looks like the backoff time is what we would want to adjust.
10-15-2017 17:48:43.645 -0700 ERROR ExecProcessor - message from "F:\Splunk\bin\splunk-wmi.exe" WMI - Unable to connect to WMI namespace "\XXXX.dmz\root\cimv2" (attempt to connect took 998 microseconds) (error="The RPC server is unavailable." HRESULT=800706BA)
10-15-2017 17:48:43.645 -0700 ERROR ExecProcessor - message from "F:\Splunk\bin\splunk-wmi.exe" WMI - Giving up attempt to connect to WMI provider after maximum number of retries at maximum backoff time (BCASXXXX.dmz: System)
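In case it helps anyone hitting the same thing, here is a rough Python sketch for listing which hosts splunk-wmi.exe has given up on. It is based only on the two log lines above: the function name `find_abandoned` and the regex are my own, and the message format is assumed from these samples, so it may need adjusting for other Splunk versions.

```python
import re

# Matches the "Giving up" message as it appears in the sample above.
# Assumption: the trailing "(host: logname)" group is always last on the line.
GIVE_UP = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"ERROR ExecProcessor - "
    r".*Giving up attempt to connect to WMI provider.*"
    r"\((?P<host>[^:()]+): (?P<log>[^()]+)\)\s*$"
)


def find_abandoned(lines):
    """Return (timestamp, host, event_log) for every 'Giving up' line."""
    hits = []
    for line in lines:
        match = GIVE_UP.match(line)
        if match:
            hits.append((match.group("ts"), match.group("host"), match.group("log")))
    return hits


# The two log lines from this post, copied verbatim.
sample = [
    r'10-15-2017 17:48:43.645 -0700 ERROR ExecProcessor - message from "F:\Splunk\bin\splunk-wmi.exe" WMI - Unable to connect to WMI namespace "\XXXX.dmz\root\cimv2" (attempt to connect took 998 microseconds) (error="The RPC server is unavailable." HRESULT=800706BA)',
    r'10-15-2017 17:48:43.645 -0700 ERROR ExecProcessor - message from "F:\Splunk\bin\splunk-wmi.exe" WMI - Giving up attempt to connect to WMI provider after maximum number of retries at maximum backoff time (BCASXXXX.dmz: System)',
]

for ts, host, log in find_abandoned(sample):
    print(ts, host, log)
```

To run it against a real log, pass an open file handle instead of `sample`. These messages should be in splunkd.log, but the exact path depends on the install. Each host it lists would need its collection stopped and started again, as I did for the one device above.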