Stop S3 log file from re-indexing

jkostovich
Explorer

Hello,

I have a log file stored in an S3 bucket that I am trying to index properly. I am using SQS-based messaging to pull the log down with the Splunk_AWS_Add-on.

The current process, which is not working, is:

  1. The app overwrites the old log in S3 with the current one.
  2. SQS fires and records the change as a message entry.
  3. Splunk, watching that SQS queue, pulls the message down and attempts to index the log.
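
For anyone following along, here is a rough sketch of what the message in step 2 carries. It assumes the bucket sends S3 event notifications straight to SQS (no SNS wrapper in between); the function name and the sample bucket and key are made up for illustration. The point is that the notification only says "this object was written", with a size and ETag for the whole object, and nothing about which bytes were appended.

```python
import json
from urllib.parse import unquote_plus


def objects_from_sqs_body(body):
    """Extract (event, bucket, key, size, etag) from an S3 event notification.

    Assumes the SQS message body is the raw S3 notification JSON.
    """
    message = json.loads(body)
    results = []
    for record in message.get("Records", []):
        s3 = record["s3"]
        obj = s3["object"]
        results.append((
            record["eventName"],
            s3["bucket"]["name"],
            unquote_plus(obj["key"]),  # keys arrive URL-encoded
            obj.get("size"),
            obj.get("eTag"),
        ))
    return results


if __name__ == "__main__":
    # Hypothetical sample message; bucket and key are placeholders.
    sample_body = json.dumps({
        "Records": [{
            "eventName": "ObjectCreated:Put",
            "s3": {
                "bucket": {"name": "example-log-bucket"},
                "object": {"key": "logs/app.log", "size": 2048,
                           "eTag": "0123456789abcdef"},
            },
        }]
    })
    for entry in objects_from_sqs_body(sample_body):
        print(entry)
```

Every overwrite in step 1 produces a fresh notification like this for the full object, so whatever consumes it has to decide on its own how much of the object is new.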

The log is being completely re-indexed every time this happens, instead of only the new data in the log being indexed.
According to the Splunk documentation, indexing looks at the first and last 256 bytes of a file to detect changes. If the beginning differs, it reindexes the whole file immediately; if not, it reads through the file until it finds new content and adds only the events appended at the end.
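
To make that description concrete, here is a minimal sketch of a head/tail check of the kind described above. This is only my reading of the documented behavior for monitored files, not Splunk's actual code, and the names `checkpoint` and `new_bytes` are invented for the example.

```python
import zlib

CHECK_LEN = 256  # window size taken from the description above


def checkpoint(data):
    """Remember enough about a file to recognize it on the next read."""
    head_len = min(CHECK_LEN, len(data))
    return {
        "head_len": head_len,
        "head_crc": zlib.crc32(data[:head_len]),
        "offset": len(data),
        "tail_crc": zlib.crc32(data[max(0, len(data) - CHECK_LEN):]),
    }


def new_bytes(cp, current):
    """Return the bytes that would be indexed given the previous checkpoint."""
    # Beginning changed: treat it as a different file and reindex everything.
    if zlib.crc32(current[:cp["head_len"]]) != cp["head_crc"]:
        return current
    offset = cp["offset"]
    # File shrank, or the data just before the old end changed: reindex.
    if len(current) < offset:
        return current
    if zlib.crc32(current[max(0, offset - CHECK_LEN):offset]) != cp["tail_crc"]:
        return current
    # Same file, only appended to: index just the tail.
    return current[offset:]


if __name__ == "__main__":
    old = b"event 1\nevent 2\n"
    cp = checkpoint(old)
    print(new_bytes(cp, old + b"event 3\n"))
```

With a check like this, an append-only log should only ever yield the appended events, which is the behavior I expected from the S3 input as well.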

This should work perfectly well for a file that is only appended to, like the one I am currently using. After searching through a lot of configuration and documentation notes, I haven't found anything on the Splunk side that could remedy this. All options point to using a file-based input and its parameters, which is not an option with SQS-based retrieval in the add-on.

The one thing I have found that may be causing this is that S3 appears to place metadata on each object, and that metadata would change with each new upload and overwrite. Is Splunk reading this metadata, marking the object as a new file, and immediately reindexing the entire thing?

Any help on this would be greatly appreciated. This is the last hiccup in making a standard for our custom logs.


diptij
Path Finder

I'm seeing a similar problem.

The file hashes are the same, but 'stat <file>' shows the metadata has changed. Splunk also seems to re-read a file if the metadata has changed.
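
In case it helps someone reproduce the observation, this is roughly how the two can be compared. It is a sketch with an invented helper name (`content_and_mtime`), and it only shows that content and metadata can diverge; it does not prove what Splunk does with either.

```python
import hashlib
import os
import tempfile


def content_and_mtime(path):
    """Return (SHA-256 of the file content, modification time in ns)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest(), os.stat(path).st_mtime_ns


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "app.log")
        with open(path, "wb") as handle:
            handle.write(b"event 1\nevent 2\n")

        os.utime(path, ns=(1_000_000_000, 1_000_000_000))
        before = content_and_mtime(path)

        # Touch the file: metadata changes, content does not.
        os.utime(path, ns=(2_000_000_000, 2_000_000_000))
        after = content_and_mtime(path)

        print("same content:", before[0] == after[0])
        print("same mtime:  ", before[1] == after[1])
```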

Did you get a resolution?
