Getting Data In

How to avoid indexing duplicate events with files being rotated and compressed?

srenou
New Member

Hello,

We have a WebLogic instance that writes its access log with log rotation and compresses the rotated file.
When the box is under heavy load, some events appear twice in Splunk: they are indexed once when access.log is processed and indexed again when access.log.1.gz is processed.
We also see some events that were indexed only from access.log.1.gz.
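One way to measure the overlap described above is to group events by their raw text and look for any that occur more than once. This is only a sketch: the index name `web` is a hypothetical placeholder, and the `source` wildcard assumes both the live and the rotated file paths contain `access.log`.

```
index=web source="*access.log*"
| stats count dc(source) AS source_count values(source) AS sources BY _raw
| where count > 1
```

Rows where `sources` lists both access.log and access.log.1.gz are candidates for double indexing. Genuinely identical log lines (for example, the same request logged twice within the same second) would also show up here, and grouping by `_raw` can be expensive, so a narrow time range is advisable.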

Ignoring the .gz file would mean losing those extra events, but in the current setup we are indexing duplicates.
Is there a workaround that loses no data and avoids indexing duplicates?
I saw a proposal to remove the duplicates at search time (`| dedup _raw`), but that sounds like an after-the-fact fix.
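For reference, a minimal sketch of that search-time workaround. The index name `web` is a hypothetical placeholder and the `source` wildcard is an assumption about the file paths. It hides duplicates in the search results only; the duplicate events remain in the index and still count toward indexed volume.

```
index=web source="*access.log*"
| dedup _raw
```

Note that `dedup _raw` also collapses legitimately identical log lines, so counts derived from the result may be slightly low.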

This does not seem to happen when the system is not under stress, so Splunk is at least sometimes able to recognize that access.log.1.gz is the compressed version of the previously indexed access.log.

Thanks for any help; I am trying to get all of my data with no duplicates.

nettrigger
Explorer

I have the same problem, and to this day Splunk has not been able to give me a real, professional answer. Disappointing.
