Deployment Architecture

Avoid indexing same file multiple times batch input

ips_mandar
Builder

I have batch input

[batch://C:\abc\*.zip]
move_policy = sinkhole
index = xyz
host_segment = 2
crcSalt = <SOURCE>
sourcetype = pqr
disabled = false

for testing I added one zip file in monitored folder after consumed by splunk I again added same file in monitored folder and I found duplicate events. I was assumed that it will not index same file since I have included crcSalt=<SOURCE>. What can be done avoid duplication?

Note- file monitored is zip- csv file with headers.

Tags (1)
0 Karma

gcusello
Esteemed Legend

Hi ips_mandar
file names in zip files are the same or different?
both the times you had files in zip?
crcSalt=<SOURCE> guarantees that you don't index twice files with the same name, but if you have the same filename with a different path you have two files.

Bye.
Giuseppe

0 Karma

ips_mandar
Builder

Hi @gcusello,
I am manually copying same zip file to monitor directory and number of times I am pasting files in monitored folder same number of times it is duplicating events with same source.
and zip file name and inside file name are same .

0 Karma

ips_mandar
Builder

Not sure if it works for batch input since it works for monitor input.

0 Karma

ips_mandar
Builder

Any idea anyone to avoid indexing same file multiple time in batch input?

0 Karma
Get Updates on the Splunk Community!

The Splunk Success Framework: Your Guide to Successful Splunk Implementations

Splunk Lantern is a customer success center that provides advice from Splunk experts on valuable data ...

Splunk Training for All: Meet Aspiring Cybersecurity Analyst, Marc Alicea

Splunk Education believes in the value of training and certification in today’s rapidly-changing data-driven ...

Investigate Security and Threat Detection with VirusTotal and Splunk Integration

As security threats and their complexities surge, security analysts deal with increased challenges and ...