Getting Data In

Should I activate Indexer Acknowledgement and Persistent Queuing on all forwarders to prevent data loss when upgrading an indexer cluster?

pinVie
Path Finder

Hello all - hope someone can tell me if the following is a good idea.

I have to upgrade an Indexer cluster and search heads from 6.1.2 to 6.2.4 without losing data sent from any forwarder during the indexer downtime.

My plan is to have the forwarders queue all incoming data while the indexers are down. As far as I know, universal forwarders do this by default, but only 500 KB in memory (the default output queue size). I'm quite sure 500 KB is not enough, so I want to activate Indexer Acknowledgement and Persistent Queuing on all forwarders. I'd activate it before updating the indexers and deactivate it as soon as the indexers are running fine and the queues have drained.
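
For illustration, the outputs.conf side of this on each forwarder would look roughly like the sketch below (group name, server addresses and queue size are placeholders, not my real setup):

# outputs.conf on a forwarder (example values only)
[tcpout]
defaultGroup = primary_indexers

[tcpout:primary_indexers]
server = idx1.example.com:9997, idx2.example.com:9997
# indexer acknowledgement: the forwarder holds on to data until the indexer confirms receipt
useACK = true
# raise the in-memory output queue well above the 500 KB default
maxQueueSize = 10MB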

Two questions:
- In general, is this a good idea to prevent losing incoming logs? I have forwarders with multiple roles (reading Windows event logs, receiving syslog, reading custom log files, ...).
- Can persistent queuing be deactivated again afterwards without problems?

Thanks a lot!


maciep
Champion

Not an answer, but I did want to mention that I submitted an enhancement request to allow for a rolling upgrade of the indexer cluster, regardless of whether it's a major, minor, or maintenance upgrade. I find it ridiculous that I have to bring down my entire HA cluster to do an upgrade.

Not sure if that will be implemented soon or ever, but maybe the more people that request it, the more priority they'll give it.

Good luck not losing events during your upgrade, hopefully this approach will work.


atari1050
Path Finder

Hello -
That really depends on the velocity of the data coming in and what type of input it is (TCP/UDP inputs can use persistent queues; file-based monitor inputs cannot).

Following this guide: http://docs.splunk.com/Documentation/Splunk/6.2.4/Data/Usepersistentqueues

It looks to be configurable, but you are also going to have to wade through a bunch of error messages, and your indexers are going to be seriously overtaxed for however long the cluster is down and while they work through the backlog afterwards.
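
To give a rough idea, a persistent queue is set per input stanza in inputs.conf on the forwarder - the port and sizes below are made-up examples, so size them for your actual data rate:

# inputs.conf on a forwarder (example values only)
[tcp://:1514]
# in-memory queue that sits in front of the persistent queue
queueSize = 10MB
# once the memory queue fills, events spill to a disk-backed queue up to this size
persistentQueueSize = 5GB

Monitor (file) inputs don't accept these settings, which is why file-based data is the problem case.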

You should be testing this in a Sandbox environment before even seriously considering it in a Prod one.

Hypothetically, it looks like it could work, but you also need to consider any possible spikes in data volume and the impact on your users while this is going on.

Sincerely,
Mike
