Splunk Search

How to extract key, field name, and value with regex?

tcmarquesi
Explorer

I'm wondering if anybody has run into this freaky behavior.

I want to extract both the key (the field name) and its value from my (pretty uncommon) log. To do this, I did the following:

First, I ran the search below just to test the regex, and it works perfectly.

... | rex max_match=0 field=_raw "(?<test1>\w+)\(.+\)=(?<test2>[^\(].*)[\n|\r]"

I then replaced the test1 and test2 group names with _KEY_1 and _VAL_1 so that each matched group would be assigned to the key and the value, as I wanted.

... | rex max_match=0 field=_raw "(?<_KEY_1>\w+)\(.+\)=(?<_VAL_1>[^\(].*)[\n|\r]"

From this point on, the extraction no longer worked.

So, has anyone successfully handled the same problem using the _KEY_1 and _VAL_1 group names? It seems like a bug to me.

Thanks in advance,

Tiago

0 Karma

irenefdezbb
Observer

Maybe it's not working with _KEY_1 and _VAL_1 because Splunk reserves fields beginning with _ for its own internal use, if I remember correctly.

0 Karma

aguthrie1190
Path Finder

Late to the party here, but I had a similar need and saw that this question hadn't been answered. Basically, do your extractions, then use {} in an eval to get a variable field name.

| gentimes start=-2
| eval _raw="extract"+starttime+" this"+endtime
| rex field=_raw "(?<field_name>extract[0-9]+)\s(?<field_value>this[0-9]+)"
| eval {field_name}=field_value

Then if you care, you can get rid of the placeholder fields:

| gentimes start=-2
| fields - *human
| eval _raw="extract"+starttime+" this"+endtime
| rex field=_raw "(?<field_name>extract[0-9]+)\s(?<field_value>this[0-9]+)"
| eval {field_name}=field_value
| fields - field_name field_value

These searches should run anywhere. The idea came from here https://answers.splunk.com/answers/103700/how-do-i-create-a-field-whose-name-is-the-value-of-another....
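
For readers less familiar with SPL, the `{field_name}` trick above can be mimicked outside Splunk. What follows is only a rough Python sketch of the idea, not Splunk's implementation: the event is modeled as a plain dict, and the `starttime`/`endtime` values are made-up placeholders standing in for what `gentimes` would produce.

```python
import re

# Hypothetical stand-ins for the starttime/endtime fields that gentimes emits.
event = {"starttime": "1700000000", "endtime": "1700086400"}
event["_raw"] = "extract" + event["starttime"] + " this" + event["endtime"]

# Same pattern as the rex above; Python spells named groups (?P<name>...).
pattern = re.compile(r"(?P<field_name>extract[0-9]+)\s(?P<field_value>this[0-9]+)")

match = pattern.search(event["_raw"])
if match:
    # Equivalent in spirit to: eval {field_name}=field_value
    # The *value* of field_name becomes the name of a new field.
    event[match["field_name"]] = match["field_value"]

print(event["extract1700000000"])  # this1700086400
```

The point is that the field name is data taken from the event itself, so no underscore-prefixed placeholder names are needed.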


tcmarquesi
Explorer

Just to keep everybody on the same page: using "_" is not a problem. Indeed, both _KEY_foo and _VAL_bar are reserved group names that let Splunk find the field name and its value in the text, as described in the docs.

http://docs.splunk.com/Documentation/Splunk/6.5.0/Data/Configureindex-timefieldextraction#Add_a_rege...

0 Karma

snoobzilla
Builder

Yes, I have done this, not with a variable delimiter, but I think a field transform will work.

I used this for logs with ]:[ key-value delimiter and ] [ as pair delimiter, e.g. [KEY1]:[VALUE1] [KEY2]:[VALUE2] [KEY3.....

From the web UI, for the example above...

Create Transform...

Fields --> Field Transformations --> New
Regular Expression: \[([a-zA-Z0-9_]*?)\]\:\[([^\]]*?)\]
Source Key: _raw
Format: $1::$2

Create Extract
Then create new field extract, choose Type of transform, and point to the transform you created.

Tip: use regex101.com or equivalent to test your regex... it will work there and in transform but I get errors using this inline.
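
As a quick sanity check outside Splunk, the pairing that Format `$1::$2` expresses (group 1 becomes the field name, group 2 its value) can be sketched in Python. This is only an illustration under assumptions: the sample line and its values are invented to fit the `[KEY]:[VALUE]` shape described above, and Python's `re` engine is not identical to the PCRE engine Splunk uses.

```python
import re

# Sample line in the [KEY]:[VALUE] shape described above (values are illustrative).
line = "[KEY1]:[VALUE1] [KEY2]:[VALUE2] [KEY3]:[some value]"

# The Regular Expression from the transform, unchanged.
pair_re = re.compile(r"\[([a-zA-Z0-9_]*?)\]\:\[([^\]]*?)\]")

# findall returns (group 1, group 2) tuples; dict() pairs them up as
# name -> value, which is the role $1::$2 plays in the transform.
fields = dict(pair_re.findall(line))

for name, value in fields.items():
    print(f"{name} = {value}")
```

Each `[KEY]:[VALUE]` pair yields one field, so a single pattern covers any number of pairs on the line.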

tcmarquesi
Explorer

I've done this, but through transforms.conf. Indeed, I can see my stanza in the UI.

About the regex, I tested it exhaustively in both regex101.com and regexr.com/v1, and it's working perfectly.

0 Karma

snoobzilla
Builder

Did you try the method above with your regex, without named capturing groups at all?... e.g.

(\w+)\(.+\)=([^\(].*)[\n|\r]

Note the Format field in transform: $1::$2

0 Karma

tcmarquesi
Explorer

Yes, I did. It was my starting point.

This issue really seems like a bug to me...

0 Karma

snoobzilla
Builder

Bummer. You may be right, it may be a limitation.

Assume you saw this... https://answers.splunk.com/answers/133561/multiple-key-value-pair-extraction.html

Good luck.

0 Karma

tcmarquesi
Explorer

Thanks for all the help. 🙂

0 Karma

snoobzilla
Builder

Or maybe

(\w+?)\(.+\)=([^\(].*?)[\n|\r]  
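
One thing worth checking before switching to the lazy variant: inside `[...]` the `|` is a literal pipe character, not alternation, so `[\n|\r]` also matches `|`. The Python sketch below compares the greedy and lazy patterns on a made-up line whose value contains a pipe. The sample line is an assumption (only the `(foo 12)=` shape comes from the thread), and Python's `re` stands in for Splunk's PCRE here.

```python
import re

# Greedy pattern from earlier in the thread and the lazy variant above.
greedy = re.compile(r"(\w+)\(.+\)=([^\(].*)[\n|\r]")
lazy = re.compile(r"(\w+?)\(.+\)=([^\(].*?)[\n|\r]")

# Hypothetical log line whose value contains a literal pipe.
sample = "key(foo 12)=a|b\n"

greedy_pairs = greedy.findall(sample)
lazy_pairs = lazy.findall(sample)

print(greedy_pairs)  # [('key', 'a|b')]
print(lazy_pairs)    # [('key', 'a')]  value is cut at the first pipe
```

If values may contain `|`, a terminator class such as `[\r\n]` avoids the truncation in the lazy form.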
0 Karma

tcmarquesi
Explorer

Just a few additional comments:

I need to use regex because my log is a little unusual; it can't be parsed automatically.

I don't want to change my log with sed or something like that; it's important to me to keep it in its original form.

In fact, I intend to implement this in transforms.conf. I asked the question using an SPL search because it behaves the same way and is easier to reproduce.

Regards,

Tiago

0 Karma

snoobzilla
Builder

You need to do this using a field transform and reference that transform in a field extraction. I can get these working on regex101.com but have not had luck using them inline.

See https://answers.splunk.com/answers/126754/transforms-field-value-extract-not-fully-working.html

0 Karma

sundareshr
Legend

Splunk regex does not like a leading _ in field names. Having said that, have you looked at the extract command? That may be a better option.

... | extract kvdelim="=" pairdelim="\n"

http://docs.splunk.com/Documentation/Splunk/6.5.0/SearchReference/Extract

tcmarquesi
Explorer

Thanks, but you missed that my log is not that simple. Between key and value there is some text like "(foo 12)=". So I have to use regex; extract is ineffective.
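
To make the difference concrete, here is a small Python sketch. It is not a model of Splunk's `extract` command, only a comparison of a naive split on "=" and newline against the regex from the question. The two-line log is invented for illustration; only the `(foo 12)=` shape comes from this thread.

```python
import re

# Hypothetical log; only the "(foo 12)=" shape comes from the thread.
raw = "user(foo 12)=alice\nstatus(bar 7)=ok\n"

# Naive delimiter-based parsing: split pairs on newline, key/value on "=".
naive = dict(line.split("=", 1) for line in raw.splitlines())

# The regex from the question, with Python-style named groups.
kv_re = re.compile(r"(?P<key>\w+)\(.+\)=(?P<val>[^\(].*)[\n|\r]")
fields = {m["key"]: m["val"] for m in kv_re.finditer(raw)}

print(naive)   # keys still carry the "(foo 12)" text
print(fields)  # keys are the bare names
```

With plain delimiters the parenthesized text stays glued to the key, which is why a regex that skips over it is needed for this log shape.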

0 Karma

rjthibod
Champion

Leading underscores on field names are a no-no. Splunk uses leading underscores on field names for special / hidden fields.

Try renaming your fields to something with no leading underscore.

0 Karma

rjthibod
Champion

Here is a link with more details about internal fields. http://docs.splunk.com/Documentation/Splunk/6.5.1/Knowledge/Usedefaultfields

0 Karma