Getting Data In

Avoid duplicate file ingestion in Splunk

test4u
Path Finder

How do I stop Splunk from ingesting duplicate files?
I am monitoring a folder that contains a file named abcd.csv. When I make a copy of this file and paste it into the same folder, it gets ingested again. How do I restrict Splunk from doing so?

0 Karma

lakshman239
Influencer

If you copy the same file and place it in the folder, it is likely to be indexed again. As @FrankVl said, the Splunk input monitor process checks the CRC before indexing files. Please set up inputs.conf to index only the files or file patterns you need. Additionally, you can use whitelist/blacklist.

https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Howlogfilerotationishandled

https://docs.splunk.com/Documentation/Splunk/7.2.4/Data/Whitelistorblacklistspecificincomingdata

0 Karma

FrankVl
Ultra Champion

The whole point is that, by default, Splunk does not index files again if they are an exact copy of already ingested files.
If Splunk is ingesting those files again, that points to some specific config being in place that overrides the default behavior (e.g. changes to the crcSalt setting). I would look for the solution there, rather than in changing the pattern or using white/blacklists.

But let's see what the current config is, so we can determine the best course of action 🙂

0 Karma

FrankVl
Ultra Champion

What are your inputs.conf settings for that folder? Because by default Splunk ignores files that have the same content (based on a CRC calculated over the first 256 bytes or so).
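
To illustrate the idea of a CRC over the start of a file, here is a minimal Python sketch. It is not Splunk's actual implementation, which is internal and tracks more state than this. The function names, the use of `zlib.crc32`, and the 256-byte length are assumptions made for the example.

```python
import os
import tempfile
import zlib

HEAD_BYTES = 256  # assumption: mirrors the "first 256 bytes or so" mentioned above


def head_crc(path, length=HEAD_BYTES):
    """Return a CRC32 of the first `length` bytes of a file."""
    with open(path, "rb") as handle:
        return zlib.crc32(handle.read(length))


def looks_already_indexed(path, seen_crcs):
    """True if a file with the same leading bytes was seen before; otherwise record it."""
    crc = head_crc(path)
    if crc in seen_crcs:
        return True
    seen_crcs.add(crc)
    return False


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "abcd.csv")
        copy = os.path.join(folder, "abcd - Copy.csv")
        for name in (original, copy):
            with open(name, "w") as handle:
                handle.write("id,value\n1,foo\n")
        seen = set()
        print(looks_already_indexed(original, seen))  # False
        print(looks_already_indexed(copy, seen))      # True
```

Because the check only looks at content, the file name does not matter: an exact copy under a new name produces the same CRC and is treated as already seen.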

0 Karma

test4u
Path Finder

I haven't made any changes to inputs.conf as such. The following is my inputs.conf:

[script://$SPLUNK_HOME\etc\apps\S_APP\bin\S_SCRIPT_FINAL.py]
disabled = false
index = soc
interval = 60.0
sourcetype = csv

0 Karma

FrankVl
Ultra Champion

Right, so it is a scripted input, not a file monitor as your question suggested. So the solution probably needs to be found in the workings of that script.
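
Since the contents of S_SCRIPT_FINAL.py were not posted, the following is only a sketch of one way a scripted input could skip duplicate files itself. It assumes the script reads CSV files from a folder and writes them to stdout (which is what Splunk indexes for a scripted input). The names `emit_new_files`, `watch_dir`, and `state_file` are hypothetical, and the state file should live outside the watched folder.

```python
import hashlib
import json
import os
import sys


def file_digest(path):
    """SHA-256 of the full file content, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_seen(state_file):
    """Load the set of digests recorded by earlier runs (empty on first run)."""
    if not os.path.exists(state_file):
        return set()
    with open(state_file) as handle:
        return set(json.load(handle))


def save_seen(state_file, seen):
    """Persist the digests so the next scheduled run can skip known content."""
    with open(state_file, "w") as handle:
        json.dump(sorted(seen), handle)


def emit_new_files(watch_dir, state_file, out=sys.stdout):
    """Write each not-yet-seen CSV to `out`; return the names that were emitted."""
    seen = load_seen(state_file)
    emitted = []
    for name in sorted(os.listdir(watch_dir)):
        path = os.path.join(watch_dir, name)
        if not name.lower().endswith(".csv") or not os.path.isfile(path):
            continue
        digest = file_digest(path)
        if digest in seen:
            continue  # same content already emitted, possibly under another name
        with open(path, "r") as handle:
            out.write(handle.read())
        seen.add(digest)
        emitted.append(name)
    save_seen(state_file, seen)
    return emitted
```

Hashing the whole file rather than just the first bytes is a deliberate choice in this sketch: CSV files often share an identical header line, so a short prefix could wrongly flag different files as duplicates.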

0 Karma