Stop S3 log file from re-indexing

jkostovich
Explorer

Hello,

I have a log file stored in an S3 bucket that I am trying to index properly. I am using SQS-based messaging to pull the log down with the Splunk_AWS_Add-on.

The current process, which is not working, is:

  1. The app overwrites the old log in S3 with the current one.
  2. SQS fires and records the change in an entry.
  3. Splunk, watching that SQS queue, pulls the message down and attempts to index the file.
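
To see what the consumer actually receives in step 2, here is a minimal Python sketch that pulls the object details out of an SQS message body carrying an S3 event notification. The function name and the sample bucket and key values are hypothetical. The field names follow the S3 event notification JSON format; the "Message" unwrapping assumes the bucket notifies SNS and SNS forwards to SQS without raw message delivery.

```python
import json
from typing import Dict, List
from urllib.parse import unquote_plus


def s3_objects_from_sqs_body(body: str) -> List[Dict[str, object]]:
    """Return bucket/key/size/etag for each ObjectCreated record in the body."""
    payload = json.loads(body)
    # When S3 -> SNS -> SQS is used, the S3 event is a JSON string under "Message".
    if "Message" in payload and "Records" not in payload:
        payload = json.loads(payload["Message"])

    objects = []
    for record in payload.get("Records", []):
        # An overwrite of an existing key arrives as an ObjectCreated event.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        obj = record["s3"]["object"]
        objects.append({
            "bucket": record["s3"]["bucket"]["name"],
            # Object keys are URL-encoded in S3 event notifications.
            "key": unquote_plus(obj["key"]),
            "size": obj.get("size"),
            "etag": obj.get("eTag"),
        })
    return objects
```

Note that the notification only says the whole object was created again; it carries no information about which bytes are new.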

The entire log is re-indexed every time this happens, instead of only the new data in the log.
According to the Splunk documentation, indexing works by looking at the first and last 256 characters of a file to detect differences. If it finds a difference at the top, it immediately re-indexes the whole file. If not, it reads through the file until it finds something different, and then adds only the new events to the index if they were appended at the end.
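
A rough Python sketch of that check as I understand it, for illustration only. This is not Splunk's actual code: HEAD_LEN, Checkpoint, and new_data are hypothetical names, and CRC32 is just a stand-in checksum.

```python
import zlib
from dataclasses import dataclass
from typing import Optional, Tuple

HEAD_LEN = 256  # assumed window size, taken from the description above


@dataclass
class Checkpoint:
    head_crc: int  # checksum of the first HEAD_LEN bytes seen last time
    offset: int    # number of bytes already indexed
    tail_crc: int  # checksum of the last HEAD_LEN bytes before offset


def _crc(chunk: bytes) -> int:
    return zlib.crc32(chunk)


def new_data(content: bytes, cp: Optional[Checkpoint]) -> Tuple[bytes, Checkpoint]:
    """Return the bytes that still need indexing, plus the updated checkpoint."""
    head = _crc(content[:HEAD_LEN])
    start = 0  # default: treat as a new file and index everything
    if cp is not None and head == cp.head_crc and len(content) >= cp.offset:
        # Same beginning: check that the data just before the old offset is intact.
        tail = _crc(content[max(0, cp.offset - HEAD_LEN):cp.offset])
        if tail == cp.tail_crc:
            start = cp.offset  # only the appended bytes are new
    end = len(content)
    updated = Checkpoint(head, end, _crc(content[max(0, end - HEAD_LEN):end]))
    return content[start:], updated
```

With a check like this, appending to a file that is already longer than HEAD_LEN returns only the appended bytes, while any change near the top returns the whole file.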

This should work perfectly well for a file that only gets appends at the end, like the one I am using. After searching through a lot of configuration and documentation notes, I haven't found anything on the Splunk side that could remedy this. All options point to using a file input and its parameters, which is not an option with SQS-based retrieval in the add-on.

The one thing I have found that may be causing this is that S3 appears to attach metadata to each file, and that metadata would change with every new upload and overwrite. Is Splunk reading this metadata, treating the object as a new file, and immediately re-indexing the entire thing?

Any help on this would be greatly appreciated. This is the last hiccup in making a standard for our custom logs.

diptij
Path Finder

I'm seeing a similar problem.

The file hashes are the same, but 'stat <file>' shows that the metadata has changed. Splunk also seems to re-read a file if its metadata has changed.
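
For anyone who wants to reproduce that observation, a small Python sketch follows. The names fingerprint and demo are hypothetical, and the timestamps are set by hand with os.utime so the result does not depend on timing. It rewrites a file with identical content and compares the content hash with the modification time.

```python
import hashlib
import os
import tempfile
from typing import Tuple


def fingerprint(path: str) -> Tuple[str, int, int]:
    """Return (sha256 of content, mtime in nanoseconds, size in bytes)."""
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    st = os.stat(path)
    return digest, st.st_mtime_ns, st.st_size


def demo() -> Tuple[Tuple[str, int, int], Tuple[str, int, int]]:
    content = b"event 1\nevent 2\n"
    t0 = 1_600_000_000 * 10**9  # arbitrary fixed timestamps, in nanoseconds
    t1 = t0 + 60 * 10**9
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        with open(path, "wb") as f:
            f.write(content)
        os.utime(path, ns=(t0, t0))
        before = fingerprint(path)
        # Overwrite with byte-identical content, one minute "later".
        with open(path, "wb") as f:
            f.write(content)
        os.utime(path, ns=(t1, t1))
        after = fingerprint(path)
    return before, after
```

The hash and size match while the modification time differs, which is the same pattern 'stat' shows here. Whether Splunk or the add-on keys on that metadata is still the open question.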

Did you get a resolution?
