Deployment Architecture

AIX HACMP and shared storage log monitor

Engager

Hi, is anyone doing this sort of configuration?
An AIX HACMP cluster with shared storage, and the log file is on the shared storage.
It's an active-passive cluster, so node A is active and B is standby.
Splunk tails the log files on the shared storage, which is active on the A side.
A logs to Splunk fine.

Now, if we fail over to node B, Splunk on A is stopped as the failover happens.
B starts up, sees the logs as "new", and therefore sends duplicate records to Splunk.

Is there a simple way to tell Splunk not to re-read all the historical data?
Is there a way to have a virtual Splunk "instance" that can run on either node, alongside the "actual" Splunk instance on each node?

Any best practice for clustered environment with shared storage docs?

Thanks in advance 🙂


Engager

Much obliged for that. I'll review our current setup, which is running local Splunk forwarders on each node at the moment.


SplunkTrust

Best practice from an HACMP (or PowerHA if you prefer - marketing people sigh) perspective is to have a Splunk forwarder on the shared storage and as part of the resource group. Each system runs a forwarder for its own local, non-clustered logfiles - but the "shared" instance deals with all of the clustered data. This is pretty common stuff for HACMP environments where there is state that needs to be dealt with out on the shared storage.
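So the shared-storage forwarder only monitors the clustered log paths, and each node's local forwarder handles everything else. A minimal inputs.conf sketch for the shared instance — the mount point, index, and sourcetype names here are placeholders, not anything from your setup:

```ini
# $SPLUNK_HOME/etc/system/local/inputs.conf on the shared-storage forwarder
# /sharedfs/app/logs is an assumed mount point; use your real shared filesystem.
[monitor:///sharedfs/app/logs]
index = main
sourcetype = app_logs
```

Because this instance (binaries, config, and the fishbucket that tracks read positions) lives on the shared storage and moves with the resource group, the surviving node picks up reading where the failed node left off, instead of re-reading from the start.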

Your resource group stop/start scripts need to take into account stopping and starting the "cluster" Splunk forwarder. You should probably also plan for it to be on a mount point that is globally unique within the cluster.
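The stop/start hook might look something like this sketch of a PowerHA/HACMP application-server script. The path and the wrapper name are assumptions; a production script would also handle status checks and your cluster's exit-code conventions:

```shell
#!/bin/sh
# Sketch of a resource-group start/stop method for the "cluster" Splunk
# forwarder that lives on shared storage. SPLUNK_HOME is an assumed path;
# point it at the forwarder install inside the resource group.
SPLUNK_HOME="${SPLUNK_HOME:-/sharedfs/splunkforwarder}"

splunk_rg() {
    case "$1" in
        start)
            # The shared filesystem is mounted by now; bring the forwarder up.
            "$SPLUNK_HOME/bin/splunk" start --accept-license --answer-yes --no-prompt
            ;;
        stop)
            # Must complete before HACMP unmounts the shared filesystem.
            "$SPLUNK_HOME/bin/splunk" stop
            ;;
        *)
            echo "usage: $0 {start|stop}" >&2
            return 2
            ;;
    esac
}
```

HACMP runs the start method after the volume group and filesystems are online, and the stop method before they are released, which is exactly the ordering you need so the forwarder never sees a half-mounted log directory.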

This is basically the "virtual" instance you mentioned above. It's not that difficult to set up; on Unix systems, Splunk doesn't care how many instances of it you're running. There are a couple of gotchas, like port numbers and (maybe) SPLUNK_BINDIP. (You may want to bind the resource-group forwarder only to a resource-group IP, for example.)
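For those gotchas, the shared instance can carry its own settings in its own config tree on the shared storage. A sketch, where the IP and port are placeholders for your resource-group service address:

```ini
# $SPLUNK_HOME/etc/splunk-launch.conf on the shared-storage instance
# Bind only to the resource-group service IP (placeholder address).
SPLUNK_BINDIP=10.0.0.50

# $SPLUNK_HOME/etc/system/local/web.conf
# Give this instance its own management port so it never collides with
# the node-local forwarder's default 8089.
[settings]
mgmtHostPort=10.0.0.50:8090
```

Since the service IP moves with the resource group, whichever node is active ends up with both its local forwarder and the cluster forwarder listening on distinct addresses and ports.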

As with anything cluster-related: test, test, test, in a variety of normal, failure, and maintenance modes.