I need to add some information from the Splunk metadata (host and source) to the raw log. I'm looking at using props/transforms to prepend the host and source (from the metadata) to the _raw log line (which is hopefully what is forwarded when using Splunk to forward to syslog).
E.g. transforming _raw from:
2012-02-08T05:59:02.783+0000: 357044.868: [GC [PSYoungGen: 517888K->1904K(521088K)] 616370K->100386K(2093952K), 0.0038650 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
To:
myhostname.domain /var/log/gc_throughput.log 2012-02-08T05:59:02.783+0000: 357044.868: [GC [PSYoungGen: 517888K->1904K(521088K)] 616370K->100386K(2093952K), 0.0038650 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]
I can see how to add info to the start of _raw, but as far as I can tell I can only work off a single source field. If I use a SOURCE_KEY of MetaData:Host and DEST_KEY of _raw, I have no record of the original contents of _raw to prepend to. If I use SOURCE_KEY of _raw, obviously I can't capture the hostname.
E.g. this will just overwrite _raw with the hostname, losing the rest of the log line, unless I can somehow reference the existing value of _raw.
[GC_add_host_source]
SOURCE_KEY = MetaData:Host
REGEX = (?m)^(.*)$
FORMAT = $1
DEST_KEY = _raw
One idea is to use a SOURCE_KEY and DEST_KEY of _meta and rewrite the whole meta line, which afaik includes the _raw field, but then I need to know what the original meta line looks like, especially the ordering of the meta fields.
Does anyone know how to do this?
Background, if you are interested: Splunk has been tasked with forwarding certain logs to a syslog-ng receiver, where a custom reporting tool runs. However, the syslog-ng server needs to be able to filter the forwarded logs somehow in order to put them into different files on the filesystem. I have managed to get Splunk to forward the logs in raw syslog format, but the raw logs do not contain any information to filter on. I want to add identifying info to the log line for the syslog-ng receiver to work with.
If you just want to add information to your event entries, do this when searching your data:
... | eval _raw=host . " - " . _raw
Create an eventtype or macro that will execute that for you when you search your data and you'll have it.
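A minimal sketch of such a macro, assuming a hypothetical name prepend_host defined in macros.conf. A macro is probably the better fit of the two, since event type definitions cannot contain piped commands:

```
# macros.conf
# "prepend_host" is a hypothetical macro name chosen for illustration.
# It rewrites _raw at search time only; the indexed data is unchanged.
[prepend_host]
definition = eval _raw=host . " - " . _raw

# Usage in a search:  ... | `prepend_host`
```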
You could then create a saved 'real-time' search that would retrieve the data and forward it on with a triggered script.
It's not as elegant as forwarding _raw directly via syslog forwarding, but it will get you there.
I got it working. I had misread the documentation and didn't realise that the $0 variable applied to DEST_KEYs other than _meta.
These are the props.conf and transforms.conf entries that turn this:
2012-02-16T17:24:20.535+0000: 546116.043: [GC [PSYoungGen: 516768K->3008K(519104K)] 610387K->97883K(2091968K), 0.0173060 secs]
Into this:
vrtstspk001.iggroup.local /tmp/gc_throughput.log3 2012-02-16T17:24:20.535+0000: 546116.043: [GC [PSYoungGen: 516768K->3008K(519104K)] 610387K->97883K(2091968K), 0.0173060 secs]
props.conf:
[GC_throughput]
TRANSFORMS-GC_throughput = GC_add_source, GC_add_host
transforms.conf:
[GC_add_host]
SOURCE_KEY = MetaData:Host
REGEX = ^host::(.*)$
FORMAT = $1 $0
DEST_KEY = _raw
[GC_add_source]
SOURCE_KEY = MetaData:Source
REGEX = ^source::(.*)$
FORMAT = $1 $0
DEST_KEY = _raw
It would be even better if I could do both the source and host within the same transform...
Also, it is not actually necessary for me to add the host value to the _raw log line for my use case, because the syslog forwarding part of Splunk includes the original hostname (where the log was read from) in the syslog headers. I just need the source path.
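Since only the source path is needed, a sketch of the reduced configuration, assuming the same stanza names as in the entries above, would reference just the source transform:

```
# props.conf
[GC_throughput]
TRANSFORMS-GC_throughput = GC_add_source

# transforms.conf
[GC_add_source]
SOURCE_KEY = MetaData:Source
REGEX = ^source::(.*)$
# $1 is the captured source path; $0 is the value of DEST_KEY (_raw)
# before this transform ran, so the original log line is kept.
FORMAT = $1 $0
DEST_KEY = _raw
```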
I've managed to see what I think is the entire contents of _meta by writing it to a field with this transform:
[GC_add_host]
SOURCE_KEY = _meta
REGEX = (.*)
FORMAT = $0 testinghost::"$1"
DEST_KEY = _meta
It looks like this:
testinghost=timestartpos::0 timeendpos::28 _subsecond::.842 date_second::24 date_hour::15 date_minute::43 date_year::2012 date_month::february date_mday::16 date_wday::thursday date_zone::0
This doesn't even include the _raw field. So it looks like my idea to rewrite _raw within _meta will not work, as _raw is not there to rewrite.