There are a few things I am trying to decide on, since this is a new deployment and I would like to make it as flexible as possible from the start. After some reading, it appears that installing an LWF on our rsyslog box and forwarding to our Splunk indexers makes the most sense. Most of the logs are not from *nix boxes but from network appliances where I can't install an LWF directly. In addition, I have decided to use multiple indexes, since we have some devices that spew logs like crazy (e.g. firewall policy logs) and I would like to contain them and set my own retention policies for these indexes.
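To make the retention idea concrete, this is roughly the kind of per-index setup I have in mind in indexes.conf on the indexers. The index name, paths, and limits here are placeholders made up for illustration, not settings from a real deployment:

```
# indexes.conf on the indexers (sketch; index name and limits are placeholders)
[firewall]
homePath   = $SPLUNK_DB/firewall/db
coldPath   = $SPLUNK_DB/firewall/colddb
thawedPath = $SPLUNK_DB/firewall/thaweddb
# Roll buckets to frozen (deleted by default) after roughly 30 days
frozenTimePeriodInSecs = 2592000
# Cap the total size of this index at roughly 50 GB
maxTotalDataSizeMB = 51200
```

The point is that a noisy index like this one can age out quickly without affecting the retention of the quieter indexes.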
The equipment logging to rsyslog is all over the map in terms of brand and log format, so I want to be able to influence the hostname and the sourcetype of the log files before they hit the indexer. That way I can apply custom settings on a per-host or per-sourcetype basis at the indexer as I see fit. I realize I can do this with everything coming in as syslog, but I find that approach kind of cumbersome. I think it makes more sense to define a sourcetype right at the LWF, so that my props.conf/transforms.conf make a clear reference to the sourcetype they will apply their changes to.
Or should I take everything in as syslog and do whatever per-host props/transforms at the indexer, rather than setting the sourcetype beforehand?
I am trying to move away from the approach where you point your monitor statement at a log directory and let it do whatever the heck it wants to. I find that creates lots of noise that isn't very clear or concise, and you end up having to write transform after transform for the [syslog] sourcetype.
Does anyone have an opinion on this, or on what they have found works best for them?
I would advise taking a closer look at the forwarding documentation so that you can set your whitelists and blacklists. You could also set your sourcetypes statically so that you don't get the varying syslog sourcetypes.
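As a rough sketch of what that could look like in inputs.conf on the forwarder: the directory layout (/var/log/remote/<device class>/<hostname>/...), the sourcetype names, and the index names below are all assumptions for illustration, so adjust them to however your rsyslog box actually writes its files.

```
# inputs.conf on the forwarder (sketch; paths, sourcetypes and indexes are placeholders)

# Assumes rsyslog writes firewall logs to /var/log/remote/firewall/<hostname>/<file>.log
[monitor:///var/log/remote/firewall]
# Static sourcetype, so nothing is left to automatic syslog sourcetyping
sourcetype = fw_policy
index = firewall
# Take the host from the 5th path segment: /var/log/remote/firewall/<hostname>
host_segment = 5
# Only pick up .log files and skip rotated, compressed copies
whitelist = \.log$
blacklist = \.(gz|bz2)$
disabled = false

# Same idea for a second, hypothetical device class
[monitor:///var/log/remote/switch]
sourcetype = switch_syslog
index = network
host_segment = 5
whitelist = \.log$
blacklist = \.(gz|bz2)$
disabled = false
```

With the sourcetype and host fixed at the input, props.conf on the indexer can then key off a stanza such as [fw_policy] or [host::fw*] instead of piling transforms onto [syslog].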