Getting Data In

Should I activate Indexer Acknowledgement and Persistent Queuing on all forwarders to prevent data loss when upgrading an indexer cluster?

pinVie
Path Finder

Hello all - hope someone can tell me if the following is a good idea.

I have to upgrade an Indexer cluster and search heads from 6.1.2 to 6.2.4 without losing data sent from any forwarder during the indexer downtime.

So what I want to do is make the forwarders buffer all incoming data while the indexers are down. As far as I know, universal forwarders do this by default, but only 500 KB in memory (the default output queue size). I am quite sure 500 KB is not enough, so my plan is to activate Indexer Acknowledgement and Persistent Queuing on all forwarders. I would enable both before upgrading the indexers and disable them again once the indexers are running fine and the queues have drained.
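For reference, this is a rough sketch of the outputs.conf change I have in mind on each forwarder (the group name and server names are just placeholders, and the queue size is a guess I would still have to tune):

[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = indexer01:9997, indexer02:9997
# Indexer acknowledgement: the forwarder keeps events until an indexer confirms receipt
useACK = true
# Raise the in-memory output queue above the small default
maxQueueSize = 10MB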

Two questions:
- In general, is this a good idea to prevent loss of incoming logs? My forwarders handle multiple input types (reading Windows event logs, receiving syslog, reading custom log files, ...).
- Can persistent queuing be deactivated without problems?

Thanks a lot!


maciep
Champion

Not an answer, but I did want to mention that I submitted an enhancement request to allow for a rolling upgrade of the indexer cluster, regardless of whether it is a major, minor, or maintenance upgrade. I find it ridiculous that I have to bring down my entire HA cluster to do an upgrade.

Not sure if that will be implemented soon or ever, but maybe the more people who request it, the more priority they'll give it.

Good luck not losing events during your upgrade; hopefully this approach will work.


atari1050
Path Finder

Hello -
That really depends on the velocity of your incoming data and what type it is: persistent queues support network inputs (TCP/UDP) but not file-based ones.

Following this guide: http://docs.splunk.com/Documentation/Splunk/6.2.4/Data/Usepersistentqueues

It looks to be configurable, but you are also going to have to wade through a bunch of error messages, and your indexers are going to be seriously overtaxed catching up for however long you have the cluster master down.
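Also keep in mind that persistent queues are set per input in inputs.conf on the forwarder, not in outputs.conf. A rough sketch for a hypothetical UDP syslog input (the sizes are placeholders you would tune to your data rate and expected downtime):

[udp://514]
sourcetype = syslog
# In-memory queue in front of the persistent queue
queueSize = 1MB
# Spill to disk once the in-memory queue is full; size it to cover the downtime
persistentQueueSize = 5GB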

You should test this in a sandbox environment before even seriously considering it in a production one.

Hypothetically, it looks like it could work, but you also need to consider possible spikes in data volume and the impact on your users.

Sincerely,
Mike
