We are experiencing consistent log duplication and data loss when the Splunk Universal Forwarder (UF), deployed via Helm inside our EKS cluster, is restarted or redeployed.

Environment Details:
- Platform: AWS EKS (Kubernetes)
- UF Deployment: Helm chart
- Splunk UF Version: 9.1.2
- Indexers: Splunk Enterprise 9.1.1 (self-managed)
- Source Logs: Kubernetes container logs (/var/log/containers, etc.)

Symptoms:
After a UF pod is restarted or redeployed:
- Previously ingested logs are duplicated.
- Some (not all) logs generated during the restart window are missing in Splunk.
- The fishbucket is recreated at each restart. We confirmed this by logging into the UF pod post-restart and checking /opt/splunkforwarder/var/lib/splunk/fishbucket/; the timestamps indicate it is freshly recreated (ephemeral).

Our Hypothesis:
We suspect this behavior is caused by the Splunk UF losing its ingestion state (fishbucket) on pod restart, because no PersistentVolumeClaim (PVC) is mounted at /opt/splunkforwarder/var/lib/splunk. This would explain both:
- Re-ingestion of previously read files (-> duplicates)
- Failure to re-ingest logs that are no longer available or tracked (-> data loss)

However, we are not yet certain whether the missing logs are caused by the non-persistent fishbucket, by container log rotation, or by a combination of both.

What We Need from Splunk Support:
1. How can we conclusively verify whether the missing logs are caused by fishbucket loss, file rotation, inode mismatch, or other ingestion-tracking issues?
2. What is the recommended and supported approach for maintaining ingestion state in a Kubernetes/Helm-based Splunk UF deployment?
3. Is mounting a PersistentVolumeClaim (PVC) at /opt/splunkforwarder/var/lib/splunk sufficient and reliable for preserving the fishbucket across pod restarts? (A minimal sketch of what we have in mind is at the end of this post.)
4. Are there additional best practices to prevent both log loss and duplication, especially in dynamic environments like Kubernetes?
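For reference, the change we are considering looks roughly like the sketch below. This is illustrative only: the resource names (splunk-uf, uf-state), the storage size, the single-replica Deployment, and the image tag are assumptions for discussion and do not reflect our actual Helm chart values.

    # Hypothetical excerpt: a PVC mounted at the UF state directory so the
    # fishbucket survives pod restarts. Container-log hostPath mounts and
    # outputs/inputs configuration are omitted for brevity.
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: uf-state
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: splunk-uf
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: splunk-uf
      template:
        metadata:
          labels:
            app: splunk-uf
        spec:
          containers:
            - name: splunk-uf
              image: splunk/universalforwarder:9.1.2
              volumeMounts:
                # Persist the entire var/lib/splunk tree, which includes the fishbucket.
                - name: uf-state
                  mountPath: /opt/splunkforwarder/var/lib/splunk
          volumes:
            - name: uf-state
              persistentVolumeClaim:
                claimName: uf-state

If the chart instead deploys the UF as a DaemonSet (one pod per node), we assume each pod would need its own per-node state volume rather than a single shared PVC; guidance on that pattern would also be appreciated.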