Getting Data In

Data being auto-indexed as .tmp file instead of .csv

katzr
Path Finder

Hello,

I have an auto-index set up on a folder in my Splunk directory, and the past two times a user copied their data into the folder as a .csv file, it was indexed as a .tmp file. How can I fix this problem and ensure .tmp files are not auto-indexed?

The .tmp file was indexed and the actual .csv never was. I deleted the data with the .tmp source type from Splunk, deleted the source file from the directory, renamed it, and copied it back over, but the data still did not get indexed. I ended up having to upload the file manually.


woodcock
Esteemed Legend

The reason it was not indexed after you fixed it is that, by default, Splunk does not treat the file name as uniquely identifying a source. Many systems rotate logs in place, and if the name alone identified a file, every log rotated to a backup name would be indexed again. So Splunk considers /your/path/to/file_foo.csv to be the same file as /your/path/to/file_bar.tmp as long as the first X bytes and last Y bytes match. You can change this behaviour by setting crcSalt=<SOURCE> (yes, use literally that exact string) in your inputs.conf:

http://docs.splunk.com/Documentation/Splunk/latest/admin/Inputsconf
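
A minimal sketch of what that setting could look like, assuming a monitored folder at the hypothetical path /opt/splunk/dropbox (the original post does not give the real path):

```ini
# inputs.conf
# NOTE: the path below is a placeholder, not the poster's actual folder.
[monitor:///opt/splunk/dropbox]
# Use the literal string <SOURCE>. Splunk then mixes the full source path
# into the checksum, so the same content under a new file name is treated
# as a new file instead of one it has already seen.
crcSalt = <SOURCE>
```

Be aware that with this setting a file that is renamed or rotated inside the monitored folder will be indexed again, so it is probably best kept to inputs where file names do not change after the file is written.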

richgalloway
SplunkTrust

Change your inputs.conf file to add a whitelist attribute to your monitor stanza. Something like whitelist = \.csv$ should limit Splunk to CSV files.
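
As a rough sketch, the monitor stanza might end up looking like this. The folder path is a hypothetical placeholder, since the thread does not state the real one:

```ini
# inputs.conf
# NOTE: placeholder path; substitute the folder you actually monitor.
[monitor:///opt/splunk/dropbox]
# Regex matched against the full file path: only files ending in .csv are read.
whitelist = \.csv$
# Alternative (an assumption about what fits this case): leave the whitelist
# out and exclude only the temporary files instead.
# blacklist = \.tmp$
```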


katzr
Path Finder

Can I do that for just one specific index, though?


woodcock
Esteemed Legend

You do it in inputs.conf for each [monitor://...] stanza.
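
For illustration, here is a sketch with two monitor stanzas that send data to different indexes, where only one of them is restricted to CSV files. The paths and the index names (uploads_idx, app_logs_idx) are made up for this example and are not from the thread:

```ini
# inputs.conf
# NOTE: paths and index names below are hypothetical.

# Folder where users drop CSV files; only .csv files are picked up.
[monitor:///opt/splunk/dropbox]
index = uploads_idx
whitelist = \.csv$

# A second input feeding another index; it has no whitelist, so it is unaffected.
[monitor:///var/log/myapp]
index = app_logs_idx
```

Because the whitelist belongs to the stanza and not to the index, the filter follows the monitored folder. If several stanzas feed the same index, each one would need its own whitelist line.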
