I've been advised that if we send a single log file to two different indexers (with ACKs enabled), we may have HA issues when one of the two clusters is down: once a certain buffer fills, the forwarder stops sending to the active cluster as well, in order to avoid data loss.
The suggested solution was to install a second indexer, and read the same file twice that way. We've been working on that assumption, but it now looks to me like we could instead define the monitor twice and use a different crcSalt value in each one, so that the raw data on disk is read twice. (This is actually a directory tree of static gzip files that are not rotated, just deleted after a certain age.) Does this seem logical? Potentially via a symlink to the same directory, if you can't define the exact same file twice in a monitor?
This may be tough. Splunk (at least per the docs) doesn't support anything other than <SOURCE> (with the less-than / greater-than signs) as the value for crcSalt.
You're right that trying to define the same dir twice won't work; the config management framework would merge the configs.
A symlink or, in an extreme case, a mount --bind (Linux only) will let you graft things onto a different path. Perhaps from there you can do one monitor:// stanza with a crcSalt and one without? Or maybe leave both unsalted, but use the initCrcLength option to make the CRC data size different for one versus the other. Note that these are all terribly hackish solutions and could blow up in your face.
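To make the symlink idea concrete, here is a rough, untested sketch of what the two stanzas might look like. The paths /data/archive and /data/archive_link are made up for illustration, and the second is assumed to be a symlink (or bind mount) pointing at the first.

```ini
# inputs.conf (sketch only; both paths are hypothetical)
# Assumes something like: ln -s /data/archive /data/archive_link

[monitor:///data/archive]
# Unsalted: the CRC is computed from the file contents alone.

[monitor:///data/archive_link]
# Salted: the full source path is mixed into the CRC, so the same
# file reached through the second path should be treated as new.
crcSalt = <SOURCE>
```

The initCrcLength variant would drop the crcSalt line and set a different initCrcLength on one of the two stanzas instead. Either way, test on a throwaway forwarder first.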
The current docs say that it can be any fixed string or "<SOURCE>", so as long as symlinks are followed, we can just symlink the dir at a higher level and remain reasonably "clean", at least at the Splunk level, I'd have thought.
* If set, <string> is added to the CRC.
* If set to the literal string "<SOURCE>" (including the angle brackets), the full directory path to the source file is added to the CRC. This ensures that each file being monitored has a unique CRC. When crcSalt is invoked, it is usually set to <SOURCE>.
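Going by that wording, a fixed-string salt per stanza might look something like the sketch below, with each copy routed to its own output group. The paths, salt strings, group names and hostnames are all invented for illustration, and I have not verified that routing each copy to a separate group avoids the blocking behaviour described in the question.

```ini
# inputs.conf (sketch only; paths, salts and group names are hypothetical)
[monitor:///data/archive]
crcSalt = copy_for_cluster_a
_TCP_ROUTING = cluster_a

# Assumed to be a symlink to /data/archive
[monitor:///data/archive_link]
crcSalt = copy_for_cluster_b
_TCP_ROUTING = cluster_b

# outputs.conf (hostnames and ports are placeholders)
[tcpout:cluster_a]
server = idx-a.example.com:9997
useACK = true

[tcpout:cluster_b]
server = idx-b.example.com:9997
useACK = true
```

Note that every file is read and indexed twice with this layout, so license usage doubles accordingly.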