Getting Data In

Duplicate and Missing Logs After Splunk Universal Forwarder Pod Restart in EKS

Ravi1
Loves-to-Learn

We are experiencing consistent log duplication and data loss when the Splunk Universal Forwarder (UF) running as a Helm deployment inside our EKS cluster is restarted or redeployed.

Environment Details:

  • Platform: AWS EKS (Kubernetes)

  • UF Deployment: Helm chart

  • Splunk UF Version: 9.1.2

  • Indexers: Splunk Enterprise 9.1.1 (self-managed)

  • Source Logs: Kubernetes container logs (/var/log/containers, etc.)

 

Symptoms:

  1. After the UF pod is restarted or redeployed:

    • Previously ingested logs are duplicated.

    • Some (but not all) of the logs generated during the restart window are missing in Splunk.

  2. The fishbucket is recreated at each restart:

    • Confirmed by logging into the UF pod post-restart and checking:
      /opt/splunkforwarder/var/lib/splunk/fishbucket/

    • Timestamps indicate it is freshly recreated (ephemeral).
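
For reference, the post-restart check looks roughly like this (namespace and pod names are placeholders):

  # List the fishbucket directory inside the UF pod
  kubectl exec -n <namespace> <uf-pod> -- ls -la /opt/splunkforwarder/var/lib/splunk/fishbucket/

  # Compare the directory timestamps against the pod's start time
  kubectl get pod -n <namespace> <uf-pod> -o jsonpath='{.status.startTime}'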

 

Our Hypothesis:

We suspect this behavior is caused by the Splunk UF losing its ingestion state (fishbucket) on pod restart, due to the lack of a PersistentVolumeClaim (PVC) mounted to:

/opt/splunkforwarder/var/lib/splunk
 

This would explain both:

  • Re-ingestion of previously read files (-> duplicates)

  • Failure to re-ingest certain logs that are no longer available or tracked (-> data loss)

However, we are not yet certain whether the missing logs are caused by the non-persistent fishbucket, by container log rotation, or by a combination of the two.

What We Need from Splunk Support:

  • How can we conclusively verify whether the missing logs are caused by fishbucket loss, file rotation, inode mismatch, or other ingestion tracking issues?

  • What is the recommended and supported approach for maintaining ingestion state in a Kubernetes/Helm-based Splunk UF deployment?

  • Is mounting a PersistentVolumeClaim (PVC) to /opt/splunkforwarder/var/lib/splunk sufficient and reliable for preserving fishbucket across pod restarts?

  • Are there additional best practices to prevent both log loss and duplication, especially in dynamic environments like Kubernetes?


livehybrid
SplunkTrust

Hi @Ravi1 

I agree that the loss of the fishbucket state (due to ephemeral storage) is the cause of both log duplication and data loss after Splunk Universal Forwarder pod restarts in Kubernetes. When the fishbucket is lost, the UF cannot track which files and offsets have already been ingested, leading to re-reading old data (duplicates) and missing logs that rotated or were deleted during downtime.
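
One way to verify this is to query the fishbucket from inside the pod with btprobe, a low-level tool that ships with the UF (default paths shown; the container log path is a placeholder). A monitored file that has no entry right after a restart confirms the tracking state was lost:

  # Ask the fishbucket what it knows about a specific monitored file
  /opt/splunkforwarder/bin/splunk cmd btprobe \
      -d /opt/splunkforwarder/var/lib/splunk/fishbucket/splunk_private_db \
      --file /var/log/containers/<your-container>.log --validate

  # Compare inodes across a rotation to rule out inode reuse/mismatch
  ls -i /var/log/containers/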

If logs are rotated (e.g. to myapp.log.1) and Splunk is not configured to monitor the rotated file path, this can cause data loss on top of the more obvious duplication that comes from losing the file tracking in the fishbucket.
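
To illustrate the rotation gap, a monitor stanza that matches only the live files never sees data that rotated away while the pod was down. A sketch (path, index, and sourcetype are illustrative):

  # inputs.conf - matches only live *.log files;
  # anything rotated to *.log.1 during UF downtime is never read
  [monitor:///var/log/containers/*.log]
  index = k8s
  sourcetype = kube:container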

As far as I am aware, running a UF inside Kubernetes is not generally encouraged. Instead, the Splunk Validated Architecture (SVA) for sending logs to Splunk from Kubernetes uses the Splunk OpenTelemetry Collector for Kubernetes, which can send logs (amongst other things) to Splunk Enterprise / Splunk Cloud.
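
A minimal install sketch for that collector, assuming HEC is enabled on your indexers (endpoint, token, cluster name, and index are placeholders):

  helm repo add splunk-otel-collector-chart https://signalfx.github.io/splunk-otel-collector-chart
  helm repo update
  helm install my-splunk-otel-collector \
      --set="clusterName=my-eks-cluster" \
      --set="splunkPlatform.endpoint=https://<indexer>:8088/services/collector" \
      --set="splunkPlatform.token=<hec-token>" \
      --set="splunkPlatform.index=k8s" \
      splunk-otel-collector-chart/splunk-otel-collector

The collector maintains its own file-offset checkpoints, which covers the same ground as the UF's fishbucket.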

If you do want to use the UF approach (which may or may not be supported), you could look at adding a PVC, as is done for full Splunk Enterprise deployments under splunk-operator; check out the Storage Guidelines and StorageClass docs for splunk-operator.
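
As a rough sketch of what that could look like (names, size, and storage class are placeholders; a StatefulSet with volumeClaimTemplates may be a better fit if you run multiple replicas):

  # PersistentVolumeClaim for the UF state directory
  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: splunk-uf-state
  spec:
    accessModes: ["ReadWriteOnce"]
    resources:
      requests:
        storage: 1Gi

  # Fragment of the UF pod spec: mount the claim over the state directory
  volumes:
    - name: uf-state
      persistentVolumeClaim:
        claimName: splunk-uf-state
  containers:
    - name: splunk-uf
      volumeMounts:
        - name: uf-state
          mountPath: /opt/splunkforwarder/var/lib/splunk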

🌟 Did this answer help you? If so, please consider:

  • Adding karma to show it was useful
  • Marking it as the solution if it resolved your issue
  • Commenting if you need any clarification

Your feedback encourages the volunteers in this community to continue contributing.
