Getting Data In

How to configure props.conf and transforms.conf to rename a dynamic set of field names at search time?

goodsellt
Contributor

Hello!

I'm struggling to understand how I can use transforms.conf stanzas to rename a dynamic set of field names, ideally using the output of a separate extraction (or just a regex that can match them).

More specifically, I receive a series of events from a server that are keyed by the action being performed. The data looks like the following:

name=steve host=xxx download.doc_name=myfile.doc download.doc_id=12345 download.doc_owner=jeff

name=jeff host=yyy rename.new_doc_name=renamed.xls rename.old_doc_name=original.xls rename.doc_owner=jeff

And so on. Essentially, events carry static metadata, such as the user performing the action and information about the system they are on, as regular key=value pairs. However, anything relating to the action taken is stored as (action.key)=value.

What I'm looking for is a way to use transforms and props stanzas to dynamically modify the fields so that we can use them in searches as if the events looked like the following:

name=steve host=xxx action=download doc_name=myfile.doc doc_id=12345 doc_owner=jeff

name=jeff host=yyy action=rename new_doc_name=renamed.xls old_doc_name=original.xls doc_owner=jeff

I'm able to get the action= item easily enough; however, I can't find any way to then strip the action and period from the front of the other key=value pairs. I can't do this statically, as there are many more actions, each potentially having its own key=value pairs (such as new_doc_name and old_doc_name for rename actions).

Ideally I'd like to be able to use this transform with a few different sourcetypes that have similar formatting (but which we keep separate based on some other factors).

Does anyone with more experience know whether something like this is possible? Please note that changing the data before it is indexed into Splunk is not currently an option.

1 Solution

goodsellt
Contributor

Just wanted to say I was able to solve this by using multiple REPORT stanzas:

props.conf:

[sourcetype-ex]
REPORT-test = stanza1, stanza2

transforms.conf:

[stanza1]
REGEX = (\w+)\.action\=
FORMAT = action::$1

[stanza2]
REGEX = \w+\.(\w+)\=(\"[^"]+\"|\S+)
FORMAT = $1::$2
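
For readers who want to check what the second transform captures, here is a rough Python sketch that applies the same pattern to the sample events from the question. It assumes the intended pattern uses a single escaped period (\.), and the helper name apply_stanza2 is made up for illustration. Python's re module is not the regex engine Splunk uses, but these particular constructs should behave the same way. The first transform looks for a key ending in .action=, which does not appear in the sample events, so it is not simulated here.

```python
import re

# Same pattern as [stanza2], assuming a single escaped period was intended.
STANZA2 = re.compile(r'\w+\.(\w+)=("[^"]+"|\S+)')

def apply_stanza2(raw: str) -> dict:
    """Mimic FORMAT = $1::$2: group 1 becomes the field name, group 2 the value.

    Hypothetical helper for illustration only; this is not a Splunk API.
    """
    return {m.group(1): m.group(2) for m in STANZA2.finditer(raw)}

download_event = (
    "name=steve host=xxx download.doc_name=myfile.doc "
    "download.doc_id=12345 download.doc_owner=jeff"
)
rename_event = (
    "name=jeff host=yyy rename.new_doc_name=renamed.xls "
    "rename.old_doc_name=original.xls rename.doc_owner=jeff"
)

download_fields = apply_stanza2(download_event)
rename_fields = apply_stanza2(rename_event)
print(download_fields)
print(rename_fields)
```

The plain key=value pairs (name, host) are not touched by this pattern, since it only matches keys that contain a period immediately followed by a word and an equals sign.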


goodsellt
Contributor

I'm struggling to understand why transforms.conf seems to be far less effective than a pure EXTRACT in props.conf.

My transforms.conf stanza is:

    [get_action]
    REGEX = (\w+)\.action\=    
    FORMAT = action::$1

Inside props.conf I have:

[custom_sourcetype]
...standard stuff...
REPORT-action = get_action

However, it only returns 5% coverage across my events.

If I instead do this inside props.conf:

[custom_sourcetype]
....standard stuff...
EXTRACT-action = (?<action>\w+)\.action\=

I get 100% coverage in my events.

Can anyone explain this behavior?

woodcock
Esteemed Legend

Just add this to the end of your search:

... | foreach *.* [ rename "<<FIELD>>" AS "<<MATCHSEG2>>" ]

Or this:

... | foreach *.* [ eval "<<MATCHSEG2>>"='<<FIELD>>' ]
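
To make the intent of the foreach approach concrete, here is a small Python sketch of the same idea: every field whose name contains a period is renamed to the part after the period, and everything else is left alone. The function name strip_prefix is hypothetical, and splitting on the first period is an assumption of this sketch rather than a statement about how foreach assigns its wildcard segments.

```python
def strip_prefix(fields: dict) -> dict:
    """Rename 'action.key' style fields to 'key'; leave other fields unchanged.

    Illustration only: splits on the FIRST period, which is an assumption.
    """
    renamed = {}
    for key, value in fields.items():
        _prefix, sep, rest = key.partition(".")
        # sep is empty when the key has no period, so keep the key as-is.
        renamed[rest if sep else key] = value
    return renamed

extracted = {
    "name": "steve",
    "host": "xxx",
    "download.doc_name": "myfile.doc",
    "download.doc_id": "12345",
}
result = strip_prefix(extracted)
print(result)
```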

goodsellt
Contributor

I'm looking to do this with props.conf and transforms.conf, not with search commands, but thanks for the suggestions.

goodsellt
Contributor

I'd also like to point out that I want to avoid a regex that simply matches everything after the period. I'm worried there may be cases where a legitimate period appears in a key name or in one of the values (such as an IP address), which would cause problems.
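
One way to reason about this concern is to require that the key starts at a word boundary (start of event or whitespace) and that the equals sign comes directly after the single word following the period. The Python sketch below tests that idea against a made-up event; the field names src_ip and net.src.ip are invented for the test and are not from the real data. Python's re module is not Splunk's regex engine, so treat this as a quick sanity check rather than proof of how a transform would behave.

```python
import re

# Key must start at the beginning of the event or after whitespace, contain
# exactly one period, and be followed immediately by '='.
KEY_PREFIX = re.compile(r'(?:^|\s)\w+\.(\w+)=(\S+)')

# Hypothetical event: an IP address as a value, plus a key with two periods.
event = (
    "name=jeff src_ip=10.1.2.3 net.src.ip=10.9.8.7 "
    "rename.old_doc_name=original.xls"
)

pairs = KEY_PREFIX.findall(event)
print(pairs)
```

In this sketch the periods inside the IP address values are never treated as part of a key, and the key with two periods is skipped rather than mangled. Whether skipping such keys is the right outcome depends on the data.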

somesoni2
Revered Legend

Assuming you have configuration in place to separate the key and value (meaning you're able to get new_doc_name=abc.xls old_doc_name=xyz.xls), and assuming the action is the same for all fields in the event, you could extract the action using an EXTRACT attribute in props.conf.

props.conf

[yoursourcetype]
EXTRACT-action = \s+(?<Action>[^\.\s=]+)\.\S+=

goodsellt
Contributor

Thanks for this!

I do already have the EXTRACT for the action. My issue now is how to remove the action from the front of the key=value pairs where it exists, without breaking my original extraction and without modifying any of the _raw data.

somesoni2
Revered Legend

How many possible values of action can you have (just the action)?

goodsellt
Contributor

I'm not positive. There seem to be around 16 or so, but there may be actions I have not seen yet due to their rarity, as well as new actions that may be added to this data set from new sources.

somesoni2
Revered Legend

Yup, maintenance and accuracy will be an issue. If they were not, then you could have created (along with the field extraction for action) FIELDALIAS entries in props.conf for each possible action:

[yoursourcetype]
FIELDALIAS-alias = download.* AS * rename.* AS * ....