How can I rewrite/add info from metadata to the contents of the raw log line?

Builder

I need to add some information from the Splunk metadata (host and source) to the raw log. I'm looking at using props/transforms to prepend the host and source from the metadata to the _raw log line (which is hopefully what gets forwarded when Splunk forwards to syslog).

E.g., transforming _raw from:

2012-02-08T05:59:02.783+0000: 357044.868: [GC [PSYoungGen: 517888K->1904K(521088K)] 616370K->100386K(2093952K), 0.0038650 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]

To:

myhostname.domain /var/log/gc_throughput.log 2012-02-08T05:59:02.783+0000: 357044.868: [GC [PSYoungGen: 517888K->1904K(521088K)] 616370K->100386K(2093952K), 0.0038650 secs] [Times: user=0.04 sys=0.00, real=0.00 secs]

I can see how to add info to the start of _raw, but as far as I can tell I can only work off a single source field. If I use a SOURCE_KEY of MetaData:Host and DEST_KEY of _raw, I have no record of the original contents of _raw to prepend to. If I use SOURCE_KEY of _raw, obviously I can't capture the hostname.

E.g., the transform below will just overwrite _raw with the hostname, losing the rest of the log line, unless I can reference the existing value of _raw somehow.

[GC_add_host_source]
SOURCE_KEY = MetaData:Host
REGEX = (?m)^(.*)$
FORMAT = $1
DEST_KEY = _raw

One idea is to use a SOURCE_KEY and DEST_KEY of _meta and rewrite the whole meta line, which afaik includes the _raw field, but then I need to know what the original meta line looks like, especially the ordering of the meta fields.

Does anyone know how to do this?

Background, if you are interested: Splunk has been tasked with forwarding certain logs to a syslog-ng receiver, where a custom reporting tool runs. However, the syslog-ng server needs to be able to filter the forwarded logs in order to write them to different files on the filesystem. I have managed to get Splunk to forward the logs in raw syslog format, but the raw logs contain nothing to filter on. I want to add identifying info to the log line for the syslog-ng receiver to work with.
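
For illustration only, here is a rough Python sketch of what a receiver could do with a line that carries the host and source prefix described above. It is not syslog-ng configuration. The function names split_prefixed_line and target_file_name are hypothetical, the routing rule (file by base name of the source path) is just an assumption, and the split only works if neither the host nor the source path contains a space.

```python
import posixpath


def split_prefixed_line(line):
    """Split a 'host source message' line into its three parts.

    Assumes host and source contain no spaces. Raises ValueError if the
    line has fewer than three space-separated parts.
    """
    host, source, message = line.split(" ", 2)
    return host, source, message


def target_file_name(source):
    """Hypothetical routing rule: use the base name of the source path."""
    return posixpath.basename(source)


if __name__ == "__main__":
    line = (
        "myhostname.domain /var/log/gc_throughput.log "
        "2012-02-08T05:59:02.783+0000: 357044.868: "
        "[GC [PSYoungGen: 517888K->1904K(521088K)] "
        "616370K->100386K(2093952K), 0.0038650 secs]"
    )
    host, source, message = split_prefixed_line(line)
    print(host, target_file_name(source))
```

Splitting with a maximum of two splits keeps the original log line intact, including the spaces inside it.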

1 Solution

Builder

I got it working. I had misread the documentation and didn't realise that the $0 variable applied to DEST_KEYs other than _meta.

These are the props.conf and transforms.conf entries that turn this:

2012-02-16T17:24:20.535+0000: 546116.043: [GC [PSYoungGen: 516768K->3008K(519104K)] 610387K->97883K(2091968K), 0.0173060 secs]

Into this:

vrtstspk001.iggroup.local /tmp/gc_throughput.log3 2012-02-16T17:24:20.535+0000: 546116.043: [GC [PSYoungGen: 516768K->3008K(519104K)] 610387K->97883K(2091968K), 0.0173060 secs]

props.conf:

[GC_throughput]
TRANSFORMS-GC_throughput = GC_add_source, GC_add_host

transforms.conf:

[GC_add_host]
SOURCE_KEY = MetaData:Host
REGEX = ^host::(.*)$
FORMAT = $1 $0
DEST_KEY = _raw

[GC_add_source]
SOURCE_KEY = MetaData:Source
REGEX = ^source::(.*)$
FORMAT = $1 $0
DEST_KEY = _raw

It would be even better if I could do both the source and host within the same transform...
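
To make the chaining explicit, here is a small Python model of the two transforms. It is a sketch, not Splunk's implementation: apply_transform is a hypothetical helper, the event is modelled as a plain dict, and the only behaviour it mimics is that $1 is the regex capture and $0 is whatever was in the DEST_KEY before the transform ran.

```python
import re


def apply_transform(event, source_key, regex, fmt, dest_key):
    """Mimic one index-time transform on a dict-based event (simplified model).

    $0 in fmt is replaced by the previous value of dest_key,
    $1, $2, ... by the capture groups of regex matched against source_key.
    """
    match = re.search(regex, event[source_key])
    if match is None:
        return event  # no match: leave the event unchanged

    previous = event.get(dest_key, "")

    def substitute(token):
        index = int(token.group(1))
        return previous if index == 0 else match.group(index)

    updated = dict(event)
    updated[dest_key] = re.sub(r"\$(\d+)", substitute, fmt)
    return updated


if __name__ == "__main__":
    event = {
        "MetaData:Host": "host::vrtstspk001.iggroup.local",
        "MetaData:Source": "source::/tmp/gc_throughput.log3",
        "_raw": "2012-02-16T17:24:20.535+0000: 546116.043: [GC ...]",
    }
    # Same order as TRANSFORMS-GC_throughput: source first, then host.
    event = apply_transform(event, "MetaData:Source", r"^source::(.*)$", "$1 $0", "_raw")
    event = apply_transform(event, "MetaData:Host", r"^host::(.*)$", "$1 $0", "_raw")
    print(event["_raw"])
```

Because each transform prepends to the current _raw, the one listed last in props.conf ends up first on the line, which is why GC_add_source is listed before GC_add_host.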

Splunk Employee

If you just want to add information to your events, you can do it at search time:


... | eval _raw=host . " - " . _raw

Create an eventtype or macro that will execute that for you when you search your data and you'll have it.

You could then create a saved real-time search that picks up the data and forwards it on with a triggered script.

Not as elegant as forwarding straight _raw data via syslog forwarding. But, it will get you there.

Builder

Also, it is not actually necessary to add the host value to the _raw log line for my use case, because Splunk's syslog forwarding includes the original hostname (where the log was read from) in the syslog headers. I just need the source path.

Builder

I've managed to see what I think is the entire contents of _meta by writing it to a field with this transform:

[GC_add_host]
SOURCE_KEY = _meta
REGEX = (.*)
FORMAT = $0 testinghost::"$1"
DEST_KEY = _meta

It looks like this:
testinghost=timestartpos::0 timeendpos::28 _subsecond::.842 date_second::24 date_hour::15 date_minute::43 date_year::2012 date_month::february date_mday::16 date_wday::thursday date_zone::0

This doesn't even include the _raw field. So my idea of rewriting _raw within _meta will not work, as _raw is not there to rewrite.
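
A quick way to check a dumped _meta string for a given key is to split it into key::value pairs. The Python sketch below does that for the output shown above. parse_meta is a hypothetical helper, and it assumes the pairs are separated by whitespace and that no value contains a space, which holds for this sample but may not hold in general.

```python
def parse_meta(meta):
    """Parse a whitespace-separated 'key::value' string into a dict.

    Tokens without '::' are ignored. Assumes values contain no spaces.
    """
    fields = {}
    for token in meta.split():
        key, separator, value = token.partition("::")
        if separator:
            fields[key] = value
    return fields


if __name__ == "__main__":
    meta = (
        "timestartpos::0 timeendpos::28 _subsecond::.842 date_second::24 "
        "date_hour::15 date_minute::43 date_year::2012 date_month::february "
        "date_mday::16 date_wday::thursday date_zone::0"
    )
    fields = parse_meta(meta)
    print(sorted(fields))
    print("_raw" in fields)
```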
