Unable to initialize modular input "WinEventLog" after server restart

gportnoy
Explorer

We have an intermittent problem with the UF on multiple servers: it occasionally fails to start the WinEventLog component after a system restart. We only started seeing this after upgrading the servers to Windows Server 2016. When the service starts, it logs these two lines:

06-23-2019 04:44:20.122 +0000 ERROR ModularInputs - Unable to initialize modular input "WinEventLog" defined in the system context: Introspecting scheme=WinEventLog: script running failed (exited with code 255).
06-23-2019 04:44:19.575 +0000 ERROR ModularInputs - Introspecting scheme=WinEventLog: killing process, because executing it took too long (over 30000 msecs).
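
For anyone who wants to check a forwarder for this condition automatically, here is a minimal sketch that scans log lines for the "Unable to initialize modular input" error shown above. The function name `failed_modular_inputs` and the regular expression are illustrative assumptions, not part of Splunk; the pattern is based only on the two log lines quoted in this post.

```python
import re

# Pattern derived from the ERROR line quoted above; it is an assumption that
# other modular inputs log the same wording when they fail to initialize.
INIT_FAILURE = re.compile(
    r'ERROR ModularInputs - Unable to initialize modular input "(?P<name>[^"]+)"'
)


def failed_modular_inputs(lines):
    """Return the names of modular inputs reported as failing to initialize."""
    names = []
    for line in lines:
        match = INIT_FAILURE.search(line)
        if match:
            names.append(match.group("name"))
    return names


if __name__ == "__main__":
    # The two lines from this post, used as sample input.
    sample = [
        '06-23-2019 04:44:20.122 +0000 ERROR ModularInputs - Unable to initialize '
        'modular input "WinEventLog" defined in the system context: Introspecting '
        'scheme=WinEventLog: script running failed (exited with code 255).',
        '06-23-2019 04:44:19.575 +0000 ERROR ModularInputs - Introspecting '
        'scheme=WinEventLog: killing process, because executing it took too long '
        '(over 30000 msecs).',
    ]
    print(failed_modular_inputs(sample))
```

In practice the lines would be read from the forwarder's splunkd.log; the location of that file depends on the install path, so it is left out here.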

When this happens, other inputs continue to read events. For example, _internal, stream, and other data continues to be sent from this system, but nothing is processed from the Event Log. Restarting the Splunk UF service on the server fixes the problem immediately, so I know it's not a problem with inputs.conf or anything else in the configuration. It seems that some component fails to start within 30 seconds and Splunk gives up on it. The fact that this happens intermittently on the same system (some restarts are fine, others hit this) supports that. Things I tried:

  • Changing the service to Delayed Start: no change. I found some obscure documentation saying that in Server 2016 Microsoft configured services launched with Delayed Start to run at the lowest priority. https://blogs.technet.microsoft.com/askperf/2008/02/02/ws2008-startup-processes-and-delayed-automati... . Relevant quote: "The Service Control manager also sets the priority of the initial thread for these delayed services to THREAD_PRIORITY_LOWEST. This causes all of the disk I/O performed by the thread to be very low priority."
  • Upgrading from 7.1.3 to 7.2.x: no change.
  • Opening a ticket with support: there are no tunable parameters for this. Turning on debug logging for this module ("category.ModularInputs=DEBUG") did not reveal any additional helpful information.

The only idea I have left is to brute-force this by adding a scheduled task that restarts the service 10-15 minutes after a system restart. Before I do that, any suggestions from the community?


gn694
Communicator

Did you ever get this problem resolved? I just encountered the same issue on Server 2016.


gportnoy
Explorer

Never figured it out. Same problem even with the newer versions of the UF. We fell back to restarting the service through a scheduled task 10 minutes after a server gets restarted. That seems to clear it up for the most part.
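
For reference, a minimal sketch of what such a restart script might look like, not the exact task used here. It assumes the forwarder's Windows service is named `SplunkForwarder` (check the actual name in services.msc) and that the script is launched by a scheduled task with a startup trigger delayed by about 10 minutes. The `run` parameter is a hypothetical hook added only so the function can be exercised without touching a real service.

```python
import subprocess

# Assumption: the universal forwarder's service name; verify it on your host.
SERVICE_NAME = "SplunkForwarder"


def restart_service(service=SERVICE_NAME, run=subprocess.run):
    """Stop and then start a Windows service using the built-in `net` command."""
    # `net stop` returns a non-zero exit code if the service is already
    # stopped, so that failure is tolerated.
    run(["net", "stop", service], check=False)
    # A failure to start is a real problem, so let it raise CalledProcessError.
    run(["net", "start", service], check=True)
```

Calling `restart_service()` from an elevated scheduled task restarts the forwarder unconditionally; the delay itself is left to the task's trigger settings rather than to the script.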


AhmadGul23
Loves-to-Learn Everything

Hi, does anyone have an update on this? I am facing this issue as well on a Windows server.
