I have a heavy forwarder that is capturing incoming logs from thousands of Linux hosts. The hosts are sending their OS logs. As is well known, Linux logs do not identify themselves with an IP in their log sources.
Is there a way to capture their IP from the receiving port and parse it into a new field, such as src_ip?
I know we can add identifying information in the hosts outputs.conf file but we are unable to do that due to the circumstances.
The reason I am trying to accomplish this is that a lot of the hosts have a generic name such as "Linux", which is of no value, as it does not help from an analytical perspective.
Thanks!
So, to recap:
You have a bunch of Linux hosts that:
- don't always have a proper hostname
- have a UF running that reads the local syslog and forwards it to a central HF
I don't think there really is a simple solution to this. I see a few options:
- make sure your hosts have proper hostnames (which is useful for a lot more than just processing your logs)
- configure each linux system's syslog settings such that it writes the IP address in the log message
- configure each UF with a proper default host value in inputs.conf
- on each Linux system, add a symlink to the log folder where the symlink contains the host's IP address as one of the path fragments, then point Splunk at that symlink rather than the physical location, so that you can use host_segment to get the IP address.
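To illustrate that last option, a rough sketch (the path /opt/splunk-logs and the IP 10.1.2.3 are made-up placeholders, not something from your environment):

```ini
# Hypothetical sketch of the symlink + host_segment approach.
# On each host, create a symlink whose path contains the host's IP, e.g.:
#   ln -s /var/log /opt/splunk-logs/10.1.2.3
# Then in the UF's inputs.conf, monitor via the symlink:
[monitor:///opt/splunk-logs/*/messages]
# host_segment counts path segments: /opt(1)/splunk-logs(2)/10.1.2.3(3)/...
host_segment = 3
```

With that in place, the third path segment (the IP) becomes the value of the host field for each event.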
How exactly are you collecting this data? You mention a Heavy Forwarder, but in the comments you are also talking about a UF?
Are the Linux boxes sending syslog over UDP or TCP to the HF, or do you have a UF locally on each Linux server, reading /var/log... and forwarding to a HF?
The UFs on the hosts are forwarding to a HF
I notice the sending IP of the UF is being logged under _internal as sourceHost... Any more ideas to capture that data and ensure it's available in index=os?
You can use connection_host = ip in inputs.conf to force the logs coming from that Linux host to carry the sender's IP in the 'host' field.
https://docs.splunk.com/Documentation/Splunk/7.2.4/Admin/Inputsconf
This way, you will be able to check the logs coming from each Linux server (using the IP, which is unique). Also, Splunk assigns a hostname upon install - check $SPLUNK_HOME/etc/system/local/inputs.conf
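A minimal sketch of what that could look like, assuming the data arrives over a network input on the receiver (connection_host only applies to network inputs; the port 9997 is just an example):

```ini
# Hypothetical sketch: inputs.conf on the receiving Splunk instance.
[splunktcp://9997]
# Use the sender's IP address as the value of the host field
connection_host = ip
```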
and this would be added to the Heavy Forwarder inputs.conf correct?
Add connection_host = ip in each of the UFs' inputs.conf
How would that work @lakshman239 ? The UF is on each linux box itself, so either receiving syslog from local host, or using a file monitor input where that setting is not even available as far as I know.
Yes @FrankVl, as far as I understand, the UF is deployed on each of the Linux data sources and uses a monitor stanza in inputs.conf to forward events. So the connection_host param should be able to help.
Except that connection_host is not available for monitor inputs, only for network inputs.
Are you collecting all of the Linux logs with a syslog server? If so, you could have your syslog server write the incoming data out to a directory with one of the parent directories named after the source IP. Then you could parse the source IP out of the source field in Splunk.
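A rough sketch of how that parsing could be done at index time, assuming the syslog server writes to per-IP directories such as /var/log/remote/&lt;ip&gt;/... (the path and stanza names here are assumptions, not your actual config):

```ini
# Hypothetical sketch: props.conf on the indexer/HF
[syslog]
TRANSFORMS-set_host_from_source = set_host_from_source

# transforms.conf: pull the IP out of the source path
# and write it into the host field.
[set_host_from_source]
SOURCE_KEY = MetaData:Source
REGEX = /var/log/remote/(\d{1,3}(?:\.\d{1,3}){3})/
DEST_KEY = MetaData:Host
FORMAT = host::$1
```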
Unfortunately not