Getting Data In

How can I detect Splunk log ingestion failures?

Explorer

I'm working with a Splunk setup that copies and indexes disk logs from remote servers via scheduled rsync transfers.

The rsync job has a bandwidth limit to avoid overloading the remote servers and the Splunk server.

During a recent incident, the bandwidth limit was hit because logs grew unusually fast (about 5 GB/hour for roughly a day), and during that time logs were not transferred for indexing.

This is by design, but Splunk reports nothing for that timeframe and gives no indication that data is missing.

Our Splunk admin says rsync transfer failures can't be reported from within Splunk. Is there a way to use Splunk to detect when logs were not indexed as expected?


Legend

You could turn on the Splunk Deployment Monitor app (it comes with Splunk).

It has some dashboards that show which forwarders are forwarding LESS than usual. You can also set alerts from within the Deployment Monitor.
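If you want something you can run ad hoc instead, a search along these lines would do it. This is just a rough sketch; index=main and the one-hour cutoff are placeholders you'd swap for your own index and tolerance. It lists every host whose most recent indexed event is more than an hour old:

    | metadata type=hosts index=main
    | eval minutesSinceLastEvent = round((now() - recentTime) / 60)
    | where minutesSinceLastEvent > 60
    | convert ctime(recentTime) AS lastSeen
    | table host lastSeen minutesSinceLastEvent

The metadata command returns a recentTime field per host (the index time of its newest event), so comparing it against now() flags hosts that have gone quiet without scanning raw events.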

This is why I don't like using rsync unless it is absolutely necessary. You end up having to manually deal with the corner cases when rsync doesn't work. Using a Splunk forwarder is a lot less hassle.


Explorer

Thanks - that sounds sensible.


Builder

I would just set up an alert that emails you when the number of events falls below a given threshold.
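Something like this could be saved as an hourly alert that triggers whenever it returns results. It's only a sketch: the disk_logs index name and the 100-event threshold are made up, so substitute whatever fits your volumes:

    index=disk_logs earliest=-1h
    | stats count by host
    | where count < 100

One caveat: a host that sent nothing at all produces no row for stats to count, so this catches a slowdown but not a total outage. To catch completely silent hosts you'd need to compare against a known host list, e.g. via | metadata or a lookup.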

Explorer

Yes, that would work, but it may cause a few false positives, for example when a server is down for maintenance.
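One way to cut down on those false positives might be to keep a lookup of hosts in planned maintenance and exclude them from the alert. A sketch, where maintenance_hosts.csv is a hypothetical lookup file containing a host column:

    index=disk_logs earliest=-1h
    | stats count by host
    | where count < 100
    | search NOT [| inputlookup maintenance_hosts.csv | fields host]

Hosts listed in the CSV are filtered out before the alert condition is evaluated, so a server in scheduled maintenance wouldn't page anyone; you'd just have to remember to remove it from the lookup afterwards.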
