Getting Data In

Data being auto-indexed as .tmp file instead of .csv

katzr
Path Finder

Hello,

I have an auto-index set up on a folder in my Splunk directory. The past two times a user copied their data into the folder as a .csv file, it was indexed as a .tmp file instead. How can I fix this problem and ensure .tmp files are not auto-indexed?

The .tmp file was indexed and the actual .csv never was. I deleted the .tmp source type data from Splunk, deleted the source file from the directory, renamed it, and copied it back over, but the data still didn't get indexed. I ended up having to upload the file manually.

woodcock
Esteemed Legend

The file was not indexed after you renamed it and copied it back because, by default, Splunk does not treat the file name as uniquely identifying a source. Many systems rotate logs in place, so if the name alone identified a file, every log rotated to a backup name would be indexed again. Splunk therefore considers /your/path/to/file_foo.csv to be the same file as /your/path/to/file_bar.tmp as long as the first X bytes and last Y bytes match. You can change this behaviour by setting crcSalt=<SOURCE> (yes, use literally that exact string) in your inputs.conf:

http://docs.splunk.com/Documentation/Splunk/latest/admin/Inputsconf
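
As a rough illustration of the crcSalt setting described above, a monitor stanza might look like the sketch below. The folder path is a hypothetical placeholder and is not taken from the original post; only the crcSalt line is the setting under discussion.

```ini
# inputs.conf (sketch; /opt/splunk/data/uploads is a hypothetical path)
[monitor:///opt/splunk/data/uploads]
# Mix the full source path into the file checksum, so files with
# identical leading content but different names are tracked separately.
crcSalt = <SOURCE>
```

One caveat: because the path then becomes part of the file's identity, a file that is later renamed or rotated is treated as new and indexed again, so this setting is best kept off folders that hold rolling logs.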

richgalloway
SplunkTrust

Change your inputs.conf file to add a whitelist attribute to your monitor stanza. Something like whitelist = \.csv$ should limit Splunk to CSV files.
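
A minimal sketch of such a stanza, assuming the monitored folder is /opt/splunk/data/uploads (a hypothetical path, not from the original post). The whitelist value is a regular expression matched against the file path, so \.csv$ accepts only paths ending in .csv, and a temporary .tmp file created during the copy is skipped.

```ini
# inputs.conf (sketch; the monitored path is a hypothetical placeholder)
[monitor:///opt/splunk/data/uploads]
# Only index files whose path ends in .csv
whitelist = \.csv$
```

If other file types in the folder should still be indexed, the inverse approach is a blacklist = \.tmp$ line in place of the whitelist.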

katzr
Path Finder

Can I do that for just one specific index, though?

woodcock
Esteemed Legend

You do it in inputs.conf for each [monitor://...] stanza.
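
For example, the sketch below restricts only one of two monitored folders to CSV files. Both paths and both index names are hypothetical; the point is that each [monitor://...] stanza carries its own settings, so the second input is unaffected.

```ini
# inputs.conf (sketch; paths and index names are hypothetical)
[monitor:///opt/splunk/data/uploads]
index = user_uploads
# Applies to this input only
whitelist = \.csv$

[monitor:///var/log/myapp]
index = app_logs
# No whitelist here, so every file in this folder is eligible
```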
