Getting Data In

Duplicate indexing of data

Path Finder

I have a situation here.

I have an abc.txt file on server1 that I am monitoring with a forwarder.

The abc.txt file is updated every hour in such a way that the whole file is cleared and the same content is written back to it.

The problem is that Splunk indexes the data from abc.txt every time the content is removed and written back, which results in the same data being duplicated multiple times.

Can somebody please help me rectify this issue? Do I need to change the initCrcLength value?


Path Finder

After some investigation, I figured out what was causing the issue and how to fix it.

The issue was with how the script wrote data to the output file that Splunk was forwarding.
On every run, the script erased the file's existing contents and then wrote the existing data plus the new data back to the file.
This made Splunk treat the file as new data and index the same events all over again.
So each run of the script produced another set of duplicates.

To fix the issue, we changed the script so that, instead of rewriting the complete data each time, it writes only the new data to the file, which avoids the duplication.
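
For anyone who wants to see the idea in code, here is a minimal sketch of the append-only approach described above. It is not the actual script from this thread: the function name `append_new_records` is hypothetical, and it assumes each record is a single unique line (for example, one that starts with a timestamp).

```python
from pathlib import Path


def append_new_records(output_path, records):
    """Append only records that are not already in the file.

    Returns the list of records that were actually written.
    Assumption: each record is one unique line of text.
    """
    path = Path(output_path)

    # Lines already present in the monitored file (empty set on first run).
    seen = set(path.read_text().splitlines()) if path.exists() else set()

    new_records = []
    for record in records:
        if record not in seen:
            seen.add(record)
            new_records.append(record)

    # Append mode ("a") leaves the existing content untouched, so the file
    # only ever grows instead of being cleared and rewritten.
    with path.open("a") as fh:
        for record in new_records:
            fh.write(record + "\n")

    return new_records
```

Calling `append_new_records("abc.txt", all_records)` on each run would then add only the lines that are not yet in the file, rather than truncating the file and writing everything back.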

Hope that makes it clear for everyone following the question.


Path Finder

@soumdey - The great thing is that you reduced the duplicate usage of your Splunk license.


Path Finder

Can somebody please help me out here?
