
Avoid duplicate indexing in Splunk

c_sahil
New Member

I have a scheduler that writes data to a log file every hour, and I index that log in Splunk. The problem is that every time the scheduler runs, it adds some rows to the log, but when I query Splunk I get double the previous number of rows plus the newly added rows. How can I avoid this duplicate indexing? The example below illustrates the problem.

Before the scheduler runs, the log has 3 rows. After it runs, it has added 1 more row, so the log has 4 rows in total. But when I query Splunk, I get 7 rows (3*2+1). How can I avoid this? Please help.


Matthias_BY
Communicator

Hello,

How are you loading the data? Via a Splunk file monitoring input in inputs.conf?

With file monitoring, Splunk forwarders and indexers remember, for each file, where they stopped reading and up to which line the file has already been captured. This record is called the "fishbucket".

You can read more here:
http://blogs.splunk.com/2008/08/14/what-is-this-fishbucket-thing/
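
To make the 3*2+1 arithmetic from the question concrete, here is a toy model of a file tailer. It is only a sketch and not Splunk's actual code: it assumes the monitor identifies a file by a checksum of its first bytes and keeps a read offset for it. The names `ToyMonitor`, `total_events` and `HEAD_BYTES` are made up for this illustration.

```python
import zlib

HEAD_BYTES = 256  # assumed size of the file head that gets checksummed


class ToyMonitor:
    """Toy single-file tailer: remembers a head checksum and a read offset."""

    def __init__(self):
        self.head_len = 0     # how many leading bytes the checksum covers
        self.head_crc = None  # checksum of those bytes at the last scan
        self.offset = 0       # number of bytes already read
        self.indexed = []     # every line "indexed" so far

    def scan(self, content: bytes) -> int:
        """Index the unread part of the file; return the number of new events."""
        same_file = (
            self.head_crc is not None
            and len(content) >= self.offset
            and zlib.crc32(content[: self.head_len]) == self.head_crc
        )
        # Known file: continue from the saved offset. Otherwise start over.
        start = self.offset if same_file else 0
        new_lines = content[start:].decode("utf-8").splitlines()
        self.indexed.extend(new_lines)
        # Save the checkpoint for the next scan.
        self.head_len = min(len(content), HEAD_BYTES)
        self.head_crc = zlib.crc32(content[: self.head_len])
        self.offset = len(content)
        return len(new_lines)


def total_events(first: bytes, second: bytes) -> int:
    """Scan two successive versions of one file; return total events indexed."""
    monitor = ToyMonitor()
    monitor.scan(first)
    monitor.scan(second)
    return len(monitor.indexed)


before = b"row1\nrow2\nrow3\n"
appended = total_events(before, before + b"row4\n")   # new row at the bottom
prepended = total_events(before, b"row4\n" + before)  # new row at the top
print(appended, prepended)
```

In this model, a row added at the bottom leaves the head of the file unchanged, so only the new row is read and 4 events are indexed in total. A row added at the top changes the head, the file looks new, all 4 rows are read again on top of the original 3, and the total is 7.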

Br
Matthias


Matthias_BY
Communicator

Hi,

Great. Just to be clear: the application that writes the log Splunk monitors was adding new lines at the top instead of appending them at the bottom, right?


c_sahil
New Member

Thanks Matthias...
I fixed the issue after reading the blog post: I was adding the data at the start of the log instead of at the bottom.
Thanks a lot.
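
For anyone hitting the same problem, a minimal sketch of the fix, assuming the scheduler job is written in Python (the thread does not say what language it uses). The function name `append_rows`, the file name and the row contents are hypothetical. Opening the file in append mode ("a") puts every write at the end and leaves the existing lines untouched.

```python
import os
import tempfile


def append_rows(path, rows):
    """Write rows at the end of the log; existing content is not rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        for row in rows:
            log.write(row + "\n")


def demo():
    """Simulate two scheduler runs against a temporary log file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scheduler.log")  # hypothetical log name
        append_rows(path, ["row1", "row2", "row3"])  # earlier runs
        append_rows(path, ["row4"])                  # next hourly run
        with open(path, encoding="utf-8") as log:
            return log.read().splitlines()


print(demo())
```

The new row ends up last in the file, so a monitor that tracks how far it has read only needs to pick up that one line.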
