Getting Data In

Duplicate records

jaterlwj
Explorer

I have tested and realized that when monitoring a file with let's say 24 rows with the option "Continuously index data from a file or directory this Splunk instance can access".
I noticed that when I add a new row and refreshes. There are now 49 rows. The older 24 records are being duplicated. Is there any option to stop duplicate rows?

Here are some specifics.
File format: .log
Specify the source:"Continuously index data from a file or directory this Splunk instance can access."

set host: constant value
set source type: manual
destination index:default

Tags (1)
0 Karma

jaterlwj
Explorer

Any help would be good!

0 Karma

jaterlwj
Explorer

Hi, Thanks for your reply.

Pardon me for my ignorance, but what should I look for under the _internal index? There's roughly 1.7m events in there. 😮

0 Karma

ak
Path Finder

check the _internal index. it appears the whole file is being reread, thus 24 + 25 rows.

0 Karma

ak
Path Finder

is your log file terminated with an end of file message, something like [END OF LOG FILE]?

if so, this will confuse splunk. splunk uses the last 256 bytes for CRC. If you have a termination message that is constantly appended to your file, the CRC check will fail. When this happens, splunk rereads the file, thus duplicating records.

See: splunk log rotation

0 Karma

jaterlwj
Explorer

Hi thank you for your reply! But the log file that I used does not contain any end of file message!

0 Karma
Career Survey
First 500 qualified respondents will receive a $20 gift card! Tell us about your professional Splunk journey.
Get Updates on the Splunk Community!

Tech Talk Recap | Mastering Threat Hunting

Mastering Threat HuntingDive into the world of threat hunting, exploring the key differences between ...

Observability for AI Applications: Troubleshooting Latency

If you’re working with proprietary company data, you’re probably going to have a locally hosted LLM or many ...

Splunk AI Assistant for SPL vs. ChatGPT: Which One is Better?

In the age of AI, every tool promises to make our lives easier. From summarizing content to writing code, ...