Getting Data In

Missing events when forwarding syslog to a heavy forwarder

w199284
Explorer

I need help troubleshooting an issue where I am missing events forwarded from a Linux syslog daemon to my heavy forwarders. Beginning on the first day of each month, for three or four days, this feed drops from ~50,000 indexed events per hour to maybe ~150. Then, just as suddenly, the feed resumes at ~50,000 events per hour for the remainder of the month. Only this one source/index is affected. All traffic is UDP.
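
The drop is easy to see by charting hourly indexed volume for the affected index with something like this (the index name here is just a placeholder):

  | tstats count where index=my_syslog_index by _time span=1h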

To troubleshoot:

  • I've removed the load balancer from the equation and now send directly to one heavy forwarder.
  • We can see the syslog events leaving the source server.
  • Using tcpdump, I can see events from the source server hitting port 514 on the heavy forwarder (capture command sketched below this list).
  • I have a dashboard showing blocking on the agg, index, parsing, and typing queues (search also sketched below). There is none.
  • I tested the regexes in my transforms against the actual events captured with tcpdump. All match correctly.
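
For reference, those two checks boil down to roughly the following; the interface, host, and source IP are placeholders for my environment:

  # On the heavy forwarder: confirm packets from the source arrive on 514/udp
  tcpdump -ni eth0 udp port 514 and host <source_ip>

  # Queue-blocking search behind the dashboard (metrics.log on the HWF)
  index=_internal host=<hwf_host> source=*metrics.log* group=queue blocked=true
  | timechart count by name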

While the event was in progress:

  • I opened a support case with diags from the HWF and one of my indexer servers (nothing back yet).
  • There are no errors or warnings in the internal logs of the heavy forwarder used in this test.
  • I've looked at all the log channels on the HWF (1,236 of them) but I don't know which one(s) to raise the logging level for (see the sketch after this list).
  • I tried starting Splunk with --debug (splunk start --debug) but I do not see any additional internal logging; I may not have done this correctly.
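
For anyone who can point me at the right channels: my understanding is that individual channels can be raised at runtime with the CLI, or persistently via log.cfg/log-local.cfg. The channel name is exactly what I don't know, so the one below is only a placeholder:

  # Runtime change on the HWF (not persistent across restarts)
  $SPLUNK_HOME/bin/splunk set log-level <ChannelName> -level DEBUG

  # Find candidate channel names for the UDP/parsing path
  grep -iE 'udp|regex|typing|aggregat' $SPLUNK_HOME/etc/log.cfg

  # Persistent alternative: add "category.<ChannelName>=DEBUG" to
  # $SPLUNK_HOME/etc/log-local.cfg and restart splunkd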

Stranger still:

  • Two syslog feeds from the source server are being used in this troubleshooting effort; the second feed is unaffected.
  • There are 78 source servers in this group, and all exhibit the same behavior, which makes Splunk look like the common denominator.

I do use a props and transforms configuration on port 514 to set the index and sourcetype for a multitude of incoming syslog feeds bound for different indexes. This configuration has not changed in a very long time (and certainly does not change for a few days at the start of each month).
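
For context, that routing configuration is shaped roughly like this; the stanza names, regexes, index, and sourcetype below are illustrative placeholders, not my actual values:

  # props.conf on the HWF
  [source::udp:514]
  TRANSFORMS-routing = set_index_fw, set_sourcetype_fw

  # transforms.conf
  [set_index_fw]
  REGEX = %ASA-\d-\d+
  DEST_KEY = _MetaData:Index
  FORMAT = network_fw

  [set_sourcetype_fw]
  REGEX = %ASA-\d-\d+
  DEST_KEY = MetaData:Sourcetype
  FORMAT = sourcetype::cisco:asa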

Frankly, I'm lost. There must be a way to expose what is happening to these events, either at the heavy forwarder or on the indexers, but I'm out of ideas. Does anyone have a thought on how I might capture the information I need to diagnose this? At the moment the feed has returned to normal, i.e. ~50,000 indexed events per hour. Thank you in advance for any advice you have.
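
When the problem recurs next month, the raw checks I plan to run on the HWF, to show whether packets are being dropped before splunkd ever sees them, look like this (again, the interface and source IP are placeholders):

  # Kernel-level UDP counters - watch whether the error/drop counters climb
  netstat -su

  # Count packets that actually reach the NIC over a fixed window
  timeout 60 tcpdump -ni eth0 -w /tmp/port514.pcap udp port 514 and host <source_ip>
  tcpdump -nr /tmp/port514.pcap | wc -l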