Here's an odd one I just noticed. I'm taking in syslog from a Cisco PIX, and I've got the input set up like this:
[udp://5150]
connection_host = dns
sourcetype = syslog
no_priority_stripping = true
no_appending_timestamp = true
I've also got a transform which changes the source type:
[iu_cisco_pix]
REGEX = %PIX-\d-[A-Z0-9_]+:
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::cisco_pix
When I do a search on the source type, I see a number of entries where the host is changed to "bytes":
<166>Aug 13 2012 11:20:54: %PIX-6-302014: Teardown TCP connection 19849033 for dmz:10.88.14.179/80 to inside:10.8.63.254/48574 duration 0:00:01 bytes 2528 TCP FINs
host=bytes sourcetype=cisco_pix source=udp:5150
Most other lines are fine:
<166>Aug 13 2012 11:54:09: %PIX-6-302013: Built outbound TCP connection 19853901 for dmz:10.88.14.179/80 (10.88.14.179/80) to inside:10.8.63.254/1183 (10.8.63.254/1183)
host=eth1.pix-01.network sourcetype=cisco_pix source=udp:5150
Why does this happen?
If I change the sourcetype to "cisco" in the input stanza, there are no problems.
This is most likely due to the transform that rewrites the host field specifically for the "syslog" sourcetype. It ships with Splunk in $SPLUNK_HOME/etc/system/default/transforms.conf (the default props.conf applies it to the [syslog] stanza via TRANSFORMS = syslog-host) and looks like this:
[syslog-host]
DEST_KEY = MetaData:Host
REGEX = :\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s
FORMAT = host::$1
This transform exists because a common setup is for Splunk to consume data from a syslog receiver that collects events from many different hosts. In that case you usually want the host field to name the machine the event originally came from, not the machine Splunk happened to read it from.
The sourcetype renaming happens after the host renaming, so this transform still takes effect even though your own transform changes the sourcetype to something other than syslog immediately afterwards. In the Teardown event, the regex latches onto the ":01 " at the end of "duration 0:00:01" and captures the next word, "bytes", as the host. The Built event contains no colon followed by two digits and then whitespace (its timestamp ends in "09:"), so the regex doesn't match and the host is left alone.
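If you want to check this outside Splunk, here is a minimal sketch that runs the [syslog-host] regex against the two events from the question. It assumes Python's re module behaves like Splunk's PCRE engine for this particular pattern, and the helper name extracted_host is made up for the example; it is not part of Splunk.

```python
import re

# Regex copied verbatim from the [syslog-host] stanza quoted above.
SYSLOG_HOST = re.compile(
    r":\d\d\s+(?:\d+\s+|(?:user|daemon|local.?)\.\w+\s+)*\[?(\w[\w\.\-]{2,})\]?\s"
)

# The two sample events from the question.
TEARDOWN = (
    "<166>Aug 13 2012 11:20:54: %PIX-6-302014: Teardown TCP connection "
    "19849033 for dmz:10.88.14.179/80 to inside:10.8.63.254/48574 "
    "duration 0:00:01 bytes 2528 TCP FINs"
)
BUILT = (
    "<166>Aug 13 2012 11:54:09: %PIX-6-302013: Built outbound TCP connection "
    "19853901 for dmz:10.88.14.179/80 (10.88.14.179/80) to "
    "inside:10.8.63.254/1183 (10.8.63.254/1183)"
)


def extracted_host(event):
    """Hypothetical helper: return what the transform would capture as $1,
    or None if the regex does not match (host stays as the input set it)."""
    match = SYSLOG_HOST.search(event)
    return match.group(1) if match else None


print(extracted_host(TEARDOWN))  # bytes
print(extracted_host(BUILT))     # None
```

The first event matches on "0:00:01 bytes ", which is why the host becomes "bytes"; the second has no match, so the host from connection_host = dns is kept.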
Fantastic answer! Thanks!!
I should mention that if I set the sourcetype in the input stanza to "cisco" instead of "syslog", the overwrites don't happen.