Getting Data In

Different sourcetypes for different syslog hosts?

mileserickson
Engager

Scenario:

  1. Multiple hosts send syslog data to the Splunk server on UDP port 514
  2. I want to be able to parse each host's data in a unique way
  3. Generally, I am not allowed to send syslog data on a non-standard port
  4. Port 514 is configured to have a sourcetype of "syslog"

One of the hosts sending syslog data is a Barracuda Web Filter. I would like to be able to map field names to the values in the space-delimited syslog entries that it generates. But, it looks like this is done in transforms.conf by sourcetype, and I don't want to apply my Barracuda-specific field mappings to every host that sends syslog data on UDP port 514.

Am I expected to define a special sourcetype for the Barracuda? If so, how do I assign the sourcetype via hostname (or some other identifying characteristic) instead of just by port number?

Update:

I tried creating etc/system/local/props.conf with the following contents, then restarting splunkd. It seems to have had no effect:

[host::barracuda-hostname.domain]
sourcetype = barracuda

Jason
Motivator

Yes, there is a way to do this. The important caveat is that with the "syslog" sourcetype, "host" is extracted from the message and overridden at index time, in the same pass in which you are trying to override the sourcetype. Splunk doesn't know about the new host value yet, so the props.conf stanza has to match on the original host, sourcetype or source:

--props.conf--
# Important part: host=barracuda hasn't been set yet at this point,
# so match on [syslog] or on the hostname of the forwarder
[syslog]
TRANSFORMS-force_st_for_barracuda = force_barracuda_st

--transforms.conf--
[force_barracuda_st]  
DEST_KEY = MetaData:Sourcetype
# Use some unique string that only appears in Barracuda events
REGEX = (barracuda-hostname\.domain|bar\.rac\.uda\.ip)
FORMAT = sourcetype::barracuda
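
Once events carry the barracuda sourcetype, the space-delimited field mapping from the question can be scoped to that sourcetype at search time. A minimal sketch, assuming a hypothetical transform name (barracuda_fields) and placeholder field names; neither reflects the Barracuda's documented log layout:

```ini
# --- props.conf ---
[barracuda]
REPORT-barracuda_fields = barracuda_fields

# --- transforms.conf ---
[barracuda_fields]
# Split each event on spaces and name the values in order.
# The field names are hypothetical placeholders; replace them with the real layout.
DELIMS = " "
FIELDS = field1, field2, field3
```

Splitting on spaces also splits the syslog header, so the first few names would line up with the timestamp and host tokens rather than with Barracuda-specific values.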

lakromani
Builder

This was nice. Just what I was looking for, and it gives a great way to separate syslog input. Thanks.


Lowell
Super Champion

As Felix mentioned, routing to different log files is a nice approach. There are many options here; it's all about finding the one that makes the most sense in your situation.

We use syslog-ng running on our central Splunk indexer. We listen on a couple of different IP addresses (one IP for normal syslog traffic, the other for syslog events coming from Cisco network devices or from our firewall). Sending the data to two different IP addresses allows us to use the standard syslog port, and if volume someday goes up we can split the work out onto separate boxes. From there we use a bunch of syslog-ng rules to place the content into different logs. Some of this is done by simple syslog filtering logic, and some of it uses host filtering and regex matching. But in the end, syslog-ng writes out basically one file per sourcetype. (I say "basically" because in some cases I found it helpful to split the log files based on severity level, which then becomes part of the log name. I then set up a field extraction in Splunk, which is nice when you want to look at only the more serious events.)


BTW, have you tried setting up field extraction directly against your host?

props.conf:

[host::barracuda-hostname.domain]
EXTRACT-fields = ^\S+\s+(?<field1>\S+) ...

If these are the only kind of events coming from that host, then a search-time field extraction should be an efficient option.

gkanapathy
Splunk Employee

A [host::hostname] stanza will only work if it references the hostname that is seen when the event arrives in Splunk. If the sourcetype of the data is syslog, a built-in transform extracts the host field from the raw event data and sets it, and that is the value you'll see in Splunk when searching. So it is important to know what the host value is before it is transformed. You can perhaps find this out by disabling the transform, or by temporarily using some sourcetype that does not have that transform.
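
One way to see the host value as it arrives is to point the UDP input at a temporary sourcetype. This is a sketch, assuming a hypothetical sourcetype name (syslog_raw) that is not defined anywhere else, so no host-extraction transform is attached to it and the host field keeps whatever the input assigned:

```ini
# --- inputs.conf ---
[udp://514]
# Hypothetical temporary sourcetype with no host-extraction transform attached
sourcetype = syslog_raw
# Set host from the sender's IP address; "dns" would use its reverse-DNS name instead
connection_host = ip
```

The host values that show up in search under this sourcetype are the ones a [host::...] stanza would need to match.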


mileserickson
Engager

Very interesting. I will try disabling the transform.


ftk
Motivator

I recommend writing the syslog messages to disk with syslogd or Kiwi Syslog Daemon and then indexing the log files, instead of sending them straight to Splunk.

This way you can easily assign different extractions to the different syslog streams based on source rather than sourcetype. There are some answers that deal with setups like this on Windows: http://answers.splunk.com/questions/5111/best-way-to-write-syslog-to-a-file-on-windows

And a wiki entry about setting this up on Linux: http://www.splunk.com/wiki/Deploy:Best_Practice_For_Configuring_Syslog_Input
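
A file-based setup along these lines might look like the following sketch. The directory layout under /var/log/remote/ is an assumption about how the syslog daemon has been configured to write its files, and field1 is only a placeholder name:

```ini
# --- inputs.conf ---
# Assumed layout: the syslog daemon writes one directory per device type
[monitor:///var/log/remote/barracuda/*.log]
sourcetype = barracuda

[monitor:///var/log/remote/linux/*.log]
sourcetype = syslog

# --- props.conf ---
# Extractions can be keyed on the file path (source) instead of the sourcetype
[source::/var/log/remote/barracuda/*.log]
EXTRACT-fields = ^\S+\s+(?<field1>\S+)
```

Because each stream lands in its own path, the Barracuda extractions never touch events from the other syslog senders.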
