Getting Data In

field extraction based on "=>"

ebailey
Communicator

I am consuming facter data from Puppet in the format of "virtual => vmware". Each host has on average 180 unique data points all in the same format with different field names.

The format is almost a key value pair, but not quite. Is it possible to have Splunk extract the fields for Puppet facts i.e. virtual => vmware so that the field name is virtual and the value is vmware without having to define a field extraction for every single field? Would it make sense to use sed in props to remove ">" to help Splunk automatically extract the field at search time?

Thanks!

0 Karma
1 Solution

lguinn2
Legend

Using sed in props.conf is not a bad idea, but it is better to use search time field extractions. You could use props.conf with transforms.conf to do the search time field extractions, like so:

props.conf

[yourpuppetsourcetype]
KV_MODE = none
REPORT-yourpuppetsourcetype=extract-puppet-fields

transforms.conf

[extract-puppet-fields]
REGEX  = (\S+?)=>(\S+?)\s+
FORMAT = $1::$2

Don't use the KV_MODE setting if you have some fields that can be auto-extracted (name=value). The REGEX above will extract name=>value pairs, but it will not properly recognize values that have embedded whitespace.

View solution in original post

lguinn2
Legend

Using sed in props.conf is not a bad idea, but it is better to use search time field extractions. You could use props.conf with transforms.conf to do the search time field extractions, like so:

props.conf

[yourpuppetsourcetype]
KV_MODE = none
REPORT-yourpuppetsourcetype=extract-puppet-fields

transforms.conf

[extract-puppet-fields]
REGEX  = (\S+?)=>(\S+?)\s+
FORMAT = $1::$2

Don't use the KV_MODE setting if you have some fields that can be auto-extracted (name=value). The REGEX above will extract name=>value pairs, but it will not properly recognize values that have embedded whitespace.

ebailey
Communicator

I removed the KV_MODE=none and changed the regex to

REGEX = (\S+)\s=>\s(\S+)

That works just fine. Thanks for pointing me in the right direction.

Ed

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to November Tech Talks, Office Hours, and Webinars!

What are Community Office Hours? Community Office Hours is an interactive 60-minute Zoom series where ...

Index This | When is October more than just the tenth month?

October 2025 Edition  Hayyy Splunk Education Enthusiasts and the Eternally Curious!   We’re back with this ...

Observe and Secure All Apps with Splunk

  Join Us for Our Next Tech Talk: Observe and Secure All Apps with SplunkAs organizations continue to innovate ...