Getting Data In

Monitor a file based on hostname

mslvrstn
Communicator

I thought this would be easy to do, but I didn't see any way to do this in inputs.conf.spec.

I have a cluster of machines where the app is configured to log directly to shared storage.
Each host writes to files named something like:

  • \\storage\logs\host1.log
  • \\storage\logs\host2.log

I have configured the splunkforwarder on each host in the cluster with servername matching hostN

How can I configure each forwarder to monitor the shared directory, but only ingest the logs that were actually written by the local host?

Note: this question is not about how to parse the host name from the filename. I know how to do that. I don't want each forwarder sending every file to the indexers; I want to restrict monitoring to the files that match the local servername.

0 Karma

mslvrstn
Communicator

In the end, we changed the config so that the forwarders wrote to local storage. Having Splunk centralizing access to the logs alleviated most of the need to have them on the shared storage anyway.

0 Karma

jrodman
Splunk Employee

It seems you want to have a given forwarder monitor a specific file in a directory, and not other files, or perhaps files in a directory which match a particular pattern, and not other files.

So on host1, where you want to monitor

\\storage\logs\host1.log

You can use a configuration such as

[monitor://\\storage\logs\host1.log]

Or, if you want to match a pattern

[monitor://\\storage\logs\host1*.log]

Then on host2 you can do the same.

If you have an environment variable which is going to expand to the hostname, you could try

[monitor://\\storage\logs\${VARIABLE}*.log]

but I can't guarantee success with this pattern; I think we only use it to expand at the beginning of the string ourselves, and don't really document the behavior.

mslvrstn
Communicator

Your comment about ${VARIABLE} is the most intriguing.
If that works, I'd like to see that documented, although we ended up going a different route.

0 Karma

yannK
Splunk Employee

What you want to do is called dynamically setting the default host value, using the "host_segment" parameter.

see http://docs.splunk.com/Documentation/Splunk/4.3/Data/Setadefaulthostforaninput

It has to be defined in your input. For example, for a host in the third part of the path

\\storage\logs\host1.log
\\storage\logs\host2.log

inputs should look like

[monitor://\\storage\logs\*.log]
host_segment = 3

0 Karma

mslvrstn
Communicator

No, what you mention is about parsing the host from the filename.
I was looking to limit which files are ingested, based on matching the forwarder's hostname against the file name.

0 Karma

FunPolice
Path Finder

Do you really need to have forwarders on every machine monitoring the same directory? I expect that it would be a lot simpler and therefore more reliable (and use a lot less horsepower) to have one forwarder monitoring the directory. You could then extract the host name from the file name at index time.

0 Karma

mslvrstn
Communicator

Not really, the shared storage itself is multi-node.

0 Karma

jrodman
Splunk Employee

But if you're storing all the data on shared storage, you still have a single point of failure.

0 Karma

mslvrstn
Communicator

Yes, it's about not having a single point of failure for the ingestion of the logs, but also not wanting to eat them multiple times.
Putting the forwarder on each node allows each node to be responsible for its own logs. It has the added benefit that it allows us to monitor node performance as well.
If a node goes down, the remaining nodes continue processing and their logs continue to get consumed by Splunk.

0 Karma

lguinn2
Legend

Best option

Create subdirectories for each host, even if you have multiple directories with the same name. Then put a stanza like this in inputs.conf

[monitor:///var/log/.../host1]

Other Option

Create a skeleton inputs.conf

Write a script to generate a real inputs.conf that is populated with the server name.

You could even integrate this script with the script to install the Universal Forwarder
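
As a rough illustration of the "Other Option", here is a minimal Python sketch that fills a skeleton stanza with the local host name. The skeleton text, the ${HOSTNAME} placeholder name, and the helper function names are all assumptions made up for this example; the substitution is done by the script itself, not by Splunk.

```python
import socket
from string import Template

# Hypothetical skeleton; the ${HOSTNAME} placeholder is an assumption
# of this sketch and is expanded by the script, not by Splunk.
SKELETON = r"""[monitor://\\storage\logs\${HOSTNAME}.log]
"""


def render_inputs(skeleton: str, hostname: str) -> str:
    """Return the skeleton with ${HOSTNAME} replaced by hostname."""
    return Template(skeleton).substitute(HOSTNAME=hostname)


def short_hostname() -> str:
    """Return the local host name without any domain suffix."""
    return socket.gethostname().split(".")[0]


if __name__ == "__main__":
    # Print the generated stanza; redirect this into the app's
    # inputs.conf as part of the forwarder install step.
    print(render_inputs(SKELETON, short_hostname()), end="")
```

This assumes the short host name matches the hostN part of the log file name; if the forwarder's servername is set differently, pass that value to render_inputs instead.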

0 Karma

mslvrstn
Communicator

Sorry, neither option is really what I'm looking for.
#1 isn't right because it doesn't take into account the differing forwarder hostnames.

Separating them out to their own directory isn't really the issue.
It's matching the path to the hostname/servername of the forwarder, wherever that happens to be in the monitored filepath.

As for #2, I know I could write an external script to modify the inputs.conf, but ideally there is some way to do it from within Splunk, using some kind of variable substitution in the monitor stanza.

If it helps, the app will be distributed via deployment server.

0 Karma