Does the indexer acknowledgement queue/list persist across Splunk restarts?

Glenn
Builder

I'd like to use indexer acknowledgement in my HA setup when forwarding from a primary indexer, which receives events from forwarders, to a secondary indexer (despite the horrible proliferation of duplicate events it can cause, but that's another issue).

I'd like to know whether the queue or list of unacknowledged events maintained on the primary indexer will persist if the primary indexer is restarted (while the secondary is still unavailable).

If it doesn't, we could easily lose the queue and have gaps in our secondary index, breaking HA.

yannK
Splunk Employee

If I am correct, the useACK = true option (set in outputs.conf on the forwarder) makes the forwarder wait for an acknowledgement from the indexer that the event has been written to disk.
So if the indexer goes down, the forwarder will retry.
As you can see, in the end it will not cause gaps, only accidental duplicates.
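
For illustration, a minimal outputs.conf sketch for the forwarding instance. The output group name and the host:port are placeholders, not values from this thread:

```ini
# outputs.conf on the forwarding instance
# (group name and server address are hypothetical)
[tcpout:secondary_indexer]
server = secondary.example.com:9997
# Keep sent data until the receiver acknowledges it; resend if no ack arrives
useACK = true
```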

Edit :

In the case of a forwarder

About tailing :

  • File tailing: Splunk keeps track of the position it has reached in each file, and if it is restarted, it resumes reading from that point.
  • Scripted inputs/network inputs (for example, syslog on port 514): a Splunk instance holds those queues in memory and cannot recover them after a restart. An easy workaround is to have a syslog-ng/rsyslog server write the logs to files and have Splunk monitor those files (the equivalent of a disk buffer).
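
A sketch of the Splunk side of that workaround, assuming the syslog daemon writes its files under a hypothetical directory /var/log/remote:

```ini
# inputs.conf
# (the path is an assumption; point it at wherever syslog-ng/rsyslog writes)
[monitor:///var/log/remote]
sourcetype = syslog
```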

About queues :

About HA acknowledgement :

sdaniels
Splunk Employee

You could use persistent queues on the forwarder so that the data is still there after a restart; otherwise, any data held only in memory is lost.

http://docs.splunk.com/Documentation/Splunk/4.3.1/Data/Usepersistentqueues
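
As a rough sketch, a persistent queue is enabled per input stanza in inputs.conf. The port and sizes below are illustrative only. Note that persistent queues apply to network and scripted inputs, not to monitored files:

```ini
# inputs.conf (illustrative values, not recommendations)
[udp://514]
# Size of the in-memory queue for this input
queueSize = 1MB
# Size of the on-disk queue used once the in-memory queue is full
persistentQueueSize = 100MB
```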

Glenn
Builder

Thanks, this is helpful.

cphair
Builder

@yannK, do you have specific measures of the cost of enabling HA acknowledgement beyond what's in the document you linked? I understand the memory usage on the forwarder side would increase, but I'd like to know the effect on the indexer side as well.

Glenn
Builder

Looks good, thanks. I think persistent queues are what I was looking for.

yannK
Splunk Employee

edited above.

Glenn
Builder

I should have been clearer. I mean: what happens to the queue on the forwarder if it goes down while the indexer is already down? I.e., does the forwarder's queue still hold the same data after it is restarted?
