So the title is what I need in a nutshell, but I'll elaborate. We have a large-ish installation with more than 100,000 hosts being indexed. Since we're indexing data for such a large number of hosts, we've left our installation open so that anyone in the corporation can send data to us via syslog, SNMP traps, or Splunk forwarders, given that they have the right IP addresses.
The problem we have now, though, is that we have no guarantee that hostnames will not be duplicated within our environment. We do, however, have a guarantee that hostname.domain will be unique. Since we know that this is unique, we want to be able to report on it via a script. Basically a check that says "this host has reported into Splunk in the last 5 minutes, and here are the sources that have reported in".
My two thoughts are to either have the host field set to the FQDN, or have the DNS domain included as a field in the records being sent. The problems here are:
So in a nutshell, I'd like to be able to get that domain info included somehow, but without any extra work on the indexers, or rather, entirely via the forwarders themselves. Then it's as simple as a REST API call that looks for host=host.domain, or at least host=hostname domain=domain, either of which is fine.
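For illustration, the REST check described above might look roughly like the following. This is only a sketch under assumptions: the host field is assumed to already hold the FQDN, the server name and the example host are made up, and authentication is left out entirely. It builds the search and the request against the search export endpoint on the management port but does not send anything.

```python
import urllib.parse
import urllib.request


def build_search(fqdn: str, minutes: int = 5) -> str:
    """Return an SPL query listing the sources seen for one host recently."""
    # Escape backslashes and double quotes so the value stays inside the quoted string.
    safe = fqdn.replace("\\", "\\\\").replace('"', '\\"')
    return f'search host="{safe}" earliest=-{minutes}m | stats count by source'


def build_request(base_url: str, fqdn: str, minutes: int = 5) -> urllib.request.Request:
    """Build (but do not send) a POST to the search export endpoint."""
    body = urllib.parse.urlencode(
        {"search": build_search(fqdn, minutes), "output_mode": "json"}
    ).encode("ascii")
    return urllib.request.Request(
        base_url.rstrip("/") + "/services/search/jobs/export",
        data=body,
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical server and host names, for illustration only.
    req = build_request("https://splunk.example.com:8089", "web01.example.com")
    print(req.full_url)
    print(req.data.decode("ascii"))
```

If the domain ended up in a separate field instead, only the search string would change, for example to host=hostname domain=domain; an empty result would then mean the host has not reported in during the window.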
So if you can't trust your DNS, your situation isn't too bright, because you are then reliant on the hosts themselves to tell you the truth. Given the size of your environment this might not be possible, but check out the following link
I would try fixing your DNS problem first.
Well, DNS not being 100% correct isn't a problem for us per se. It's more that having everything in DNS isn't important to us, and fixing it for all 100,000+ hosts will likely never happen.
Ultimately, I was hoping not to have to change the hostname to include the FQDN, and not to alter the syslog messages themselves. I'd rather have the forwarder set the host to include the FQDN (not possible, as I understand it), or put the FQDN or domain into another field that gets forwarded along with the log entries.
One option might be to use a host rewrite on your parsing/indexing Splunk systems to append or change the domains as needed:
The problem would be determining what that domain should be for each event, and whether the event data itself will tell you what that domain should be.
Trying to avoid that, unfortunately. While I know it's possible, I'm hoping to push the tagging of the records onto the hosts/forwarders themselves.