
Should I activate Indexer Acknowledgement and Persistent Queuing on all forwarders to prevent data loss when upgrading an indexer cluster?

pinVie
Path Finder

Hello all - hope someone can tell me if the following is a good idea.

I have to upgrade an Indexer cluster and search heads from 6.1.2 to 6.2.4 without losing data sent from any forwarder during the indexer downtime.

So what I want to do is make the forwarders cache all data while the indexers are down. As far as I know, universal forwarders do this by default, but only with a 500 KB in-memory queue. I am fairly sure that 500 KB is not enough, so I want to activate Indexer Acknowledgement and Persistent Queuing on all forwarders. I'd activate them before upgrading the indexers and deactivate them as soon as the indexers are running fine and the queues have drained.

Two questions:
- In general, is this a good idea to prevent loss of incoming logs? I have forwarders with multiple functions (reading Windows event logs, receiving syslog, reading custom log files, ...).
- Can persistent queuing be deactivated without problems?

Thanks a lot!


maciep
Champion

Not an answer, but I did want to mention that I submitted an enhancement request to allow a rolling upgrade of the indexer cluster regardless of whether it is a major, minor, or maintenance upgrade. I find it ridiculous that I have to bring down my entire HA cluster to do an upgrade.

Not sure if that will be implemented soon or ever, but maybe the more people request it, the higher the priority they'll give it.

Good luck not losing events during your upgrade; hopefully this approach will work.


atari1050
Path Finder

Hello,
That really depends on the velocity of your incoming data and on its input type: persistent queues work for network inputs (TCP/UDP) but not for file-based inputs.

Following this guide: http://docs.splunk.com/Documentation/Splunk/6.2.4/Data/Usepersistentqueues

It looks to be configurable, but you will also have to wade through a bunch of error messages, and your indexers are going to be seriously overtaxed for however long you have the Cluster Master down.

You should test this in a sandbox environment before even seriously considering it in production.

Hypothetically, it looks like it could work, but you also need to consider any possible spikes in data and your users.
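To put rough numbers on the velocity and spike concerns before testing, here is a minimal back-of-the-envelope sketch in Python. The data rate, downtime, and spike factor are made-up values for illustration, and the function names are hypothetical; only the 500 KB default comes from this thread. It ignores compression and per-event overhead, so treat the result as a lower bound rather than a sizing recommendation.

```python
# Rough sizing sketch: how much a forwarder would have to buffer during an outage.
# All input figures below are hypothetical examples, not measurements.

DEFAULT_IN_MEMORY_QUEUE_BYTES = 500 * 1024  # the 500 KB default mentioned in the question


def required_queue_bytes(avg_bytes_per_sec, downtime_sec, spike_factor=1.0):
    """Bytes to buffer so that nothing is dropped for the whole downtime."""
    if avg_bytes_per_sec < 0 or downtime_sec < 0 or spike_factor < 1.0:
        raise ValueError("rate and downtime must be >= 0, spike_factor >= 1.0")
    return int(avg_bytes_per_sec * downtime_sec * spike_factor)


def seconds_covered(queue_bytes, avg_bytes_per_sec):
    """How long a queue of the given size lasts at the given average rate."""
    if avg_bytes_per_sec <= 0:
        raise ValueError("rate must be > 0")
    return queue_bytes / avg_bytes_per_sec


if __name__ == "__main__":
    rate = 20 * 1024      # assumed: 20 KB/s of syslog on one forwarder
    downtime = 30 * 60    # assumed: indexers unreachable for 30 minutes
    need = required_queue_bytes(rate, downtime, spike_factor=2.0)
    print(f"buffer needed: {need / (1024 * 1024):.1f} MB")
    print(f"default queue lasts: {seconds_covered(DEFAULT_IN_MEMORY_QUEUE_BYTES, rate):.0f} s")
```

With these assumed figures the default queue covers well under a minute of a half-hour outage, which is why the queue size has to be chosen per forwarder from its own measured rate.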

Sincerely,
Mike
