I thought this would be easy to do, but I didn't see any way to do this in inputs.conf.spec.
I have a cluster of machines where the app is configured to log directly to shared storage.
Each host writes to files named something like:
I have configured the splunkforwarder on each host in the cluster with servername matching hostN
How can I configure each forwarder to monitor the shared directory, but only ingest the logs that were actually written by the local host?
Note: this question is not about how to parse the host name from the filename; I know how to do that. I don't want each forwarder sending every file to the indexers; I want to restrict monitoring to files that match the local servername.
In the end, we changed the config so that the forwarders wrote to local storage. Having Splunk centralizing access to the logs alleviated most of the need to have them on the shared storage anyway.
It seems you want to have a given forwarder monitor a specific file in a directory, and not other files, or perhaps files in a directory which match a particular pattern, and not other files.
So, on host1, where you want to monitor that host's files, you can use a configuration such as:
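A minimal sketch, assuming the shared storage is mounted at /shared/logs and host1's file is named host1_app.log (both path and file name are assumptions):

```ini
# Hypothetical path; adjust to your actual layout.
[monitor:///shared/logs/host1_app.log]
disabled = false
```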
Or, if you want to match a pattern
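For example, with the same assumed naming convention:

```ini
# Wildcards are supported in monitor paths.
[monitor:///shared/logs/host1_*.log]
disabled = false
```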
Then on host2 you can do the same.
If you have an environment variable which is going to expand to the hostname, you could try
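Something along these lines (untested; the variable name and the path are assumptions):

```ini
# Assumes $HOSTNAME is set in splunkd's environment on each node.
[monitor:///shared/logs/$HOSTNAME*.log]
disabled = false
```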
but I can't guarantee success with this pattern; I think we only use it to expand at the beginning of the string ourselves, and don't really document the behavior.
What you want to do is called "dynamically setting the default host value", using the "host_segment" parameter.
It has to be defined in your input; for example, with the host name in the third segment of the path (say, /shared/logs/host1/app.log; the path here is illustrative), inputs.conf should look like:

[monitor:///shared/logs]
host_segment = 3
Do you really need to have forwarders on every machine monitoring the same directory? I expect that it would be a lot simpler and therefore more reliable (and use a lot less horsepower) to have one forwarder monitoring the directory. You could then extract the host name from the file name at index time.
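For instance, a single forwarder could monitor the whole directory and set the host field from the file name using host_regex (the path and naming pattern here are assumptions):

```ini
# The first capture group of host_regex becomes the host field.
[monitor:///shared/logs]
host_regex = .*/(host\d+)_app\.log
```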
Yes, it's about not having a single point of failure for log ingestion, but also about not wanting to ingest the logs multiple times.
Putting the forwarder on each node allows each node to be responsible for its own logs. It has the added benefit that it allows us to monitor node performance as well.
If a node goes down, the remaining nodes continue processing and their logs continue to be consumed by Splunk.
1. Create subdirectories for each host, even if you have multiple directories with the same name. Then put a stanza like this in inputs.conf:
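For host1's forwarder, that stanza might look like this (the shared-storage path is an assumption):

```ini
# Monitor only this host's own subdirectory.
[monitor:///shared/logs/host1]
disabled = false
```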
2. Create a skeleton inputs.conf, then write a script to generate the real inputs.conf populated with the server name. You could even integrate this script with the script that installs the Universal Forwarder.
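A sketch of such a generator script, assuming a naming convention of &lt;host&gt;_*.log under /shared/logs (the path, naming convention, and output location are all assumptions):

```shell
#!/bin/sh
# Hypothetical generator: writes an inputs.conf that monitors only
# this host's files. Path and naming convention are assumptions.
HOST=$(hostname)
OUT=${1:-inputs.conf}
cat > "$OUT" <<EOF
[monitor:///shared/logs/${HOST}_*.log]
disabled = false
EOF
```

The install script could then drop the generated file into the forwarder's app local directory before starting Splunk.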
Sorry, neither option is really what I'm looking for.
#1 isn't right because it doesn't take into account the differing forwarder hostnames.
Separating them out to their own directory isn't really the issue.
It's matching the path to the hostname/servername of the forwarder, wherever that happens to be in the monitored filepath.
As for #2, I know I could write an external script to modify the inputs.conf, but ideally there is some way to do it from within Splunk, using some kind of variable substitution in the monitor stanza.
If it helps, the app will be distributed via deployment server.