Reindex Duplicates / Reindex duplicate data

omuelle1 · 01-18-2017

Hi Splunkers,

I have a somewhat complicated issue that I cannot figure out.

Every day I receive a host file that we index. About 80% of its data duplicates data from the previous days (same timestamp, same everything). Splunk indexes only the non-duplicates, which I would usually appreciate.

However, for a specific report I need a setting that makes Splunk overwrite the old events and index the duplicate from the latest source.

For example:

On 1/17 I get event 1111 (timestamp 1-17) in logfile1_17.txt.

On 1/18 I get the same event 1111 (timestamp 1-17) in logfile1_18.txt.

Now when I search for event 1111, I find it with source logfile1_17.txt, but I need it indexed again with source logfile1_18.txt.

Is there a setting so that event 1111 either shows the latest indexed file as its source or is indexed again?

Thanks.

Oliver

omuelle1 · 01-19-2017

Thank you. I went with the crcSalt setting, which also seems to reindex duplicates.
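For readers following along, here is a minimal sketch of what such a monitor stanza might look like. The post does not show the value that was used, so the literal `<SOURCE>` below is an assumption; it is the documented special value that adds the full source path to the file's CRC. The monitored path is hypothetical, and the index and sourcetype are placeholders.

```
# inputs.conf (sketch; the path is hypothetical, not from the original post)
[monitor:///data/hostfiles/logfile*.txt]
disabled = false
# Assumption: the literal string <SOURCE> mixes the full file path into the
# initial CRC, so logfile1_18.txt is treated as a new file even when it
# starts with the same content as logfile1_17.txt.
crcSalt = <SOURCE>
index = yourindex
sourcetype = somesourcetype
```

With a setting like this, both copies of event 1111 would exist in the index, one per source file, so the report would still need to pick the copy from the latest source.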

nabeel652 · 01-18-2017

If this is the behaviour you want, then consider using batch instead of monitor.

[batch://<path>]
disabled = false
move_policy = sinkhole
index = yourindex
sourcetype = somesourcetype

This will index any file placed in that directory and delete it once it has been ingested, so keep in mind that the file is gone from that directory once indexed. With move_policy set to sinkhole, the same file is reindexed if it is put in that folder again with a different timestamp, instead of Splunk following the tail.

gurugv · 03-21-2017

Would this duplicate existing data? Would the summary indices be automatically updated?
