
avoid duplicate indexing in splunk

c_sahil — Mon, 28 Apr 2014 09:00:13 GMT

I have a scheduler that writes data to my log file every hour, and I use that log in Splunk. The problem is that every time the scheduler runs it adds some rows, but when I query Splunk I get double the number of existing rows plus the added rows. How can I avoid this duplicate indexing? The example below explains my problem.

Before the scheduler runs, the log has 3 rows. After the run it adds 1 more row, so the log has 4 rows in total, but when I query Splunk it gives me (3*2+1) 7 rows. How can I avoid this? Please help.

Re: avoid duplicate indexing in splunk

Matthias_BY — Mon, 28 Apr 2014 10:53:16 GMT

Hello,

How are you loading the data? Via a Splunk file monitoring input in inputs.conf?

Splunk forwarders and indexers that use file monitoring remember, for each file, where they stopped reading, i.e. up to which line the file has already been captured. This is called the "fishbucket".

Br
Matthias
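
The bookkeeping described above can be sketched as a toy model in Python. This is not Splunk's actual implementation: the class `TailTracker`, the helper `count_indexed`, and the 256-byte head length are all assumptions made for illustration. The sketch only shows why a monitor that remembers the start of a file plus a read offset re-reads everything when rows are inserted at the top.

```python
class TailTracker:
    """Toy model of a file monitor that remembers how far it has read.

    Hypothetical illustration only, not Splunk code.
    """

    def __init__(self, head_len=256):
        self.head_len = head_len  # how many leading bytes identify the file (assumed value)
        self.head = None          # leading bytes seen on the previous poll
        self.offset = 0           # position up to which data was already indexed
        self.indexed = []         # every row handed to the "index"

    def poll(self, content: bytes) -> None:
        # If the start of the file no longer matches what was remembered,
        # treat it as a new file and read it again from the beginning.
        if self.head is not None and not content.startswith(self.head):
            self.offset = 0
        new_data = content[self.offset:]
        self.indexed.extend(new_data.decode("utf-8").splitlines())
        self.offset = len(content)
        self.head = content[:self.head_len]


def count_indexed(snapshots):
    """Feed successive file contents to a fresh tracker, return rows indexed."""
    tracker = TailTracker()
    for snapshot in snapshots:
        tracker.poll(snapshot)
    return len(tracker.indexed)


before = b"row1\nrow2\nrow3\n"
appended = before + b"row4\n"    # new row written at the bottom
prepended = b"row4\n" + before   # new row written at the top

print(count_indexed([before, appended]))   # 4: only the new row is read
print(count_indexed([before, prepended]))  # 7: the whole file is read again
```

In this model the prepended case yields 3 + 4 = 7 rows, the same count reported in the question.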

Re: avoid duplicate indexing in splunk

c_sahil — Mon, 28 Apr 2014 13:27:06 GMT

Thanks Matthias...
I fixed the issue after reading the blog. I was adding the data at the start of the log instead of at the bottom.
Thanks a lot.
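
A minimal sketch of that fix, assuming the scheduler job is written in Python (the thread does not say what language it uses). The function name `append_rows` and the file name `scheduler.log` are hypothetical. Opening the file in append mode adds rows at the end, so the bytes already read by a file monitor stay where they were.

```python
import os
import tempfile


def append_rows(path, rows):
    """Write rows at the end of the log; existing content is left untouched."""
    with open(path, "a", encoding="utf-8") as log:  # "a" = append mode
        for row in rows:
            log.write(row + "\n")


# Small demonstration in a temporary directory (paths are illustrative).
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "scheduler.log")
    append_rows(log_path, ["row1", "row2", "row3"])  # first scheduler run
    with open(log_path, encoding="utf-8") as f:
        first_run = f.read()

    append_rows(log_path, ["row4"])                  # next scheduler run
    with open(log_path, encoding="utf-8") as f:
        second_run = f.read()

    # The earlier content is still an unchanged prefix of the file.
    print(second_run.startswith(first_run))
```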

Re: avoid duplicate indexing in splunk

Matthias_BY — Mon, 28 Apr 2014 14:16:12 GMT

Hi,

Great. So just to make it clear: your application, which writes the log that Splunk monitors, was writing new lines at the top instead of appending them at the bottom, right?