Our setup is a single search head that goes out to three indexers, with a universal forwarder that sends to all three via auto load balancing. The universal forwarder's inputs are local files on the box itself: directories monitored with "[monitor://xxxxxx]" stanzas.

One particular file contains logs from a Cisco VPN3000 concentrator, and the monitor stanza in inputs.conf explicitly sourcetypes it as "syslog". However, the regular syslog extractions don't pull much valuable info from this file on their own. So I've created several custom transforms for field extraction, with the corresponding "REPORT" and time-stamping declarations in props.conf. I need to get this one log file sourcetyped as "vpn" in order to apply these.

No matter what I try, though, I can't get this to work by creating a "[source::xxxxx]" stanza in props.conf. According to the documentation, this "[source::xxxxx]" stanza needs to go in the props.conf on the forwarder so that the sourcetype gets assigned before the data leaves the box and is sent to the indexers. No dice. I've also tried placing the same thing in the props.conf on each individual indexer, and even on the search head (which logically wouldn't make ANY sense) for the heck of it. I CANNOT get it to change the sourcetype on this file from "syslog" to "vpn". Here are the relevant stanzas as they are now:
relevant inputs.conf stanza on the Universal Forwarder (this is the directory under which the file actually sits)--
[monitor:///log/syslog-ng]
disabled = false
host_regex = syslog-ng\/([^\/]*)-\d+-\d+-\d+\.log
sourcetype = syslog
props.conf on the Universal Forwarder (the "*" captures the date stamp in the filename)--
[source::/log/syslog-ng/10.1.1.2-*.log]
sourcetype=vpn
props.conf on the Search Head for dealing with VPN logs --
[vpn]
REPORT-ciscovpn = ciscovpn1,ciscovpn2,ciscovpn3,ciscovpn4
transforms.conf on the Search Head for dealing with VPN logs --
[ciscovpn1]
REGEX = .*\sSEV=\d+ (\w+)/22 RPT=\d+ (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+User \[([^ ]+)\] Group \[([^ ]+)\]\s+connected, Session Type: ([^ ]+)
FORMAT = msg_type::$1 src_ip::$2 user::$3 user_group::$4 session_type::$5
[ciscovpn2]
REGEX = .*\sSEV=\d+ (\w+)/184 RPT=\d+ (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+Group \[([^ ]+)\] User \[([^ ]+)\] Client Type: ([^ ]+) Client Application Version: ([^ ]+)
FORMAT = msg_type::$1 src_ip::$2 user_group::$3 user::$4 client_type::$5 client_version::$6
[ciscovpn3]
REGEX = .*\sSEV=\d+ (\w+)/28 RPT=\d+ (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+User \[([^ ]+)\] Group \[([^ ]+)\] disconnected:\s+Session Type: ([^ ]+)\s+Duration: ([^ ]+)\s+Bytes xmt: (\d+)\s+Bytes rcv: (\d+)\s+Reason: (.*)
FORMAT = msg_type::$1 src_ip::$2 user::$3 user_group::$4 session_type::$5 duration::$6 bytes_out::$7 bytes_in::$8 disconnect_reason::$9
[ciscovpn4]
REGEX = .*\sSEV=\d+ (\w+)/50 RPT=\d+ (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+Group \[([^ ]+)\] User \[([^ ]+)\] Connection terminated for peer \w+\. Reason: (.*) Remote Proxy (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})
FORMAT = msg_type::$1 src_ip::$2 user_group::$3 user::$4 disconnect_reason::$5 remote_proxy::$6
props.conf on the Indexers for time-stamping modification--
[vpn]
TIME_PREFIX = ^\w{3} \d{2} \d{2}\:\d{2}\:\d{2} \d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} \d+\:\s+
TIME_FORMAT = %m/%d/%Y %H:%M:%S.%3N
I can't get a search for "sourcetype=vpn" to ever return anything, while a search for "source=/log/syslog-ng/10.1.1.2-2011-04-22.log" returns all the expected events... but with sourcetype "syslog" still.
Any idea what I'm doing wrong here? Or is the universal forwarder not capable of applying a sourcetype in this manner (if it's not, the docs REALLY need to state this)? I'm able to apply the transforms and extractions by using "[host::10.1.1.2]" in the props.conf in the Search Head... but we still need the time-stamp modification at index time from the sourcetype stanza on the indexers. I GUESS I could do that also with a "[host::]" block, but we'd really like to get an identifiable sourcetype of "vpn" into all this.
Help with this would really be appreciated, because we've got to get the time-stamping setup right as soon as we can.
An explicit sourcetype in the stanza of inputs.conf should always override the source->sourcetype mapping in props.conf. Removing the inputs.conf configuration will make the props.conf configuration take effect.
Another way to fix this is to add another stanza to inputs.conf to monitor /log/syslog-ng/10.1.1.2-*.log and add the sourcetype= directive there. As of 4.1, inputs.conf supports configurations like this.
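A minimal sketch of what that second approach might look like on the forwarder, reusing the paths and settings from the question. The explicit host line on the new stanza is an assumption added here so that the more specific stanza does not depend on settings from the directory-wide one:

```ini
# inputs.conf on the universal forwarder (a sketch, not a tested config)

# Existing directory-wide monitor; files not matched below remain "syslog".
[monitor:///log/syslog-ng]
disabled = false
host_regex = syslog-ng\/([^\/]*)-\d+-\d+-\d+\.log
sourcetype = syslog

# More specific monitor for the VPN concentrator's files only.
[monitor:///log/syslog-ng/10.1.1.2-*.log]
disabled = false
sourcetype = vpn
# Assumption: host is set explicitly here instead of via host_regex.
host = 10.1.1.2
```

With something like this in place, the [source::...] stanza in the forwarder's props.conf should no longer be needed for sourcetyping, while the [vpn] stanzas on the indexers and search head stay as they are.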
Sorkin, are you sure about that? This documentation (and my own experience based directly on it) indicates otherwise; perhaps the code has changed since you posted your answer? See here:
http://docs.splunk.com/Documentation/Splunk/latest/Data/Advancedsourcetypeoverrides
I downvoted this post because sorkin's answer is the conclusive answer to this user's specific issue. It is also almost always (99.9% of the time) a better idea to set sourcetype in inputs.conf instead of props/transforms.conf.
So, if I have:
inputs.conf
[monitor:///var/log]
props.conf
[source::/var/log/anaconda.log]
sourcetype=anaconda
will I get automatic sourcetyping for everything under /var/log except /var/log/anaconda.log, which will get sourcetype=anaconda?
Yes, the separate stanza should take precedence for inputs that match its pattern. You must also include either the host regex or, more simply, host=10.1.1.2. It will not be inherited. There will not be data duplication. We map each input file to a single stanza. Enabling DEBUG logging of TailingProcessor should help you see what file input is doing in this case.
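One way to turn on that DEBUG logging is a logging override file on the forwarder. This is a sketch that assumes the forwarder reads $SPLUNK_HOME/etc/log-local.cfg and that the category is named after the TailingProcessor component mentioned above; check log.cfg on your version for the exact category name before relying on it:

```ini
# $SPLUNK_HOME/etc/log-local.cfg on the universal forwarder (assumed location)
[splunkd]
# Raise file-monitoring messages from the default level to DEBUG.
category.TailingProcessor=DEBUG
```

A restart of the forwarder is typically needed for this to take effect, after which splunkd.log should carry more detail about how each monitored file is being handled. DEBUG output is verbose, so it is worth reverting the setting once the input behavior is confirmed.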
So, the separate input stanza of [monitor:///log/syslog-ng/10.1.1.2-*.log] with sourcetype=vpn will override the monitor stanza on the entire directory? If that's the case, would I need to also include the host_regex = syslog-ng\/([^\/]*)-\d+-\d+-\d+\.log under the new stanza as well? Or would that be inherited from the other? Lastly, would this approach cause any data duplication in the indexes?
Thanks for the help, really appreciate it.