Why is my auto index uploading a CSV file twice?

katzr

Hello,

I think I have a problem where my auto index is uploading a file twice: once as the original file placed in the auto index directory, and once as another file with the same name followed by a combination of letters and numbers and then .partial. See below:

example.csv
example.csv.XYZABC1.partial

I need to figure out how to fix this problem, but I also want to determine whether the partial file is exactly the same as the original so I can delete it. Is there an easy way to run a search against two sources and compare the values of every field, to confirm that the events (and therefore the files) are identical?

Thanks!

micahkemp

You need to write your monitor stanza so that it only matches completed CSV files:

[monitor://<path>/*.csv]

instead of

[monitor://<path>/*]
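
An alternative sketch of the same idea, assuming you would rather keep monitoring the directory itself and filter by file name: the whitelist and blacklist settings in inputs.conf take regular expressions that are matched against the full file path. The <path> placeholder is the same one used above, and the .partial suffix is taken from the example file name in the question.

```
# inputs.conf (sketch; <path> is a placeholder for your monitored directory)
[monitor://<path>]
# Only monitor files whose full path ends in .csv
whitelist = \.csv$
# Explicitly skip in-progress uploads such as example.csv.XYZABC1.partial
blacklist = \.partial$
```

With the whitelist in place the blacklist line is redundant, but it documents the intent. Either form only affects what is picked up from now on; events already indexed from the .partial file stay in the index.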

katzr

Thank you, I will adjust my monitor stanza. Do you know an easy way to run a search against two sources and compare the values of every field, to confirm that the events (and therefore the files) are identical?

FrankVl

The partial file is probably created by whatever tool you use to upload the CSV file. Apparently it uploads in chunks and renames the file to the proper filename only once the upload is complete. It appears the upload takes long enough that Splunk picks up the partial file before it completes.

So I would expect Splunk to hold only a subset of the events under source=*partial compared to the proper source.

Making the monitor path more specific as suggested by micahkemp would indeed fix that.
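
To check the expectation above that the .partial source holds only a subset of the events, a search along these lines might help. It is only a sketch: <your_index> is a placeholder, the file names are the ones from the example in the question, and the time range is assumed to cover the events from both files. The first search counts events per source:

```
index=<your_index> (source="*example.csv" OR source="*example.csv.*.partial")
| stats count AS event_count BY source
```

The second groups events by their raw text and keeps only those seen in just one of the two sources:

```
index=<your_index> (source="*example.csv" OR source="*example.csv.*.partial")
| stats dc(source) AS source_count, values(source) AS sources BY _raw
| where source_count < 2
```

If the second search returns only events from the proper source, the .partial source is a strict subset of it. If it returns nothing, every distinct raw event appears in both sources. Grouping by _raw collapses identical rows within a file, so compare the counts from the first search as well.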

micahkemp

I would suggest posting that as a separate question. You're likely to get better answers, due to a larger audience.
