Splunk Search

Need advice on a complex field extraction

arkadyz1
Builder

I have some data in the following format:

CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",CommonPrefix.2.status="seen"

etc. I would like to extract fields so that each name above becomes a field name and the corresponding status becomes its value. So the data above would yield two extra fields: Field1=alive and Field2=seen. I know that those numbers always go from 1 to 7, and that .name always precedes .status.
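
To make the desired result concrete, here is a rough Python sketch of the name-to-status pairing. It is only an illustration of the intended output, not Splunk configuration; the function name name_status_fields and the max_index parameter are made up for this example.

```python
import re

# Sample event from the post above.
SAMPLE = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",'
    'CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",'
    'CommonPrefix.2.status="seen"'
)


def name_status_fields(event, max_index=7):
    """Pair each CommonPrefix.N.name with CommonPrefix.N.status (hypothetical helper)."""
    fields = {}
    for i in range(1, max_index + 1):
        name = re.search(rf'CommonPrefix\.{i}\.name="([^"]*)"', event)
        status = re.search(rf'CommonPrefix\.{i}\.status="([^"]*)"', event)
        # Skip indices that are missing either the name or the status.
        if name and status:
            fields[name.group(1)] = status.group(1)
    return fields


print(name_status_fields(SAMPLE))  # {'Field1': 'alive', 'Field2': 'seen'}
```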

I tried to make a transform like this:
In props.conf:

[MySourceType]
TRANSFORMS-myfield1 = transformed_1
...
TRANSFORMS-myfield7 = transformed_7

and in transforms.conf:

[transformed_1]
REGEX = CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”
FORMAT = $1::$2
LOOKAHEAD= 1048576
...
[transformed_7]
REGEX = CommonPrefix\.7\.name=”([^”]*)”.*CommonPrefix\.7\.status=”([^”]*)”
FORMAT = $1::$2
LOOKAHEAD= 1048576

I'm using LOOKAHEAD because my data are quite long. I tried to use _KEY_1 + _VAL_1 capturing groups as well, instead of or in addition to FORMAT. Nothing worked - the fields are not extracted.

Any ideas on what to fix here?

1 Solution

adamsaul
Communicator

arkadyz1,

Try this regex:

(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)(?:CommonPrefix\.1\.status=\")(\w*)(?:\")
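
If you want to try this pattern outside Splunk first, a minimal Python sketch follows. It uses Python's re module rather than Splunk's regex engine, so treat it as a sanity check only. The helper name extract and the "not-alive" variation are assumptions added for illustration; they are not from the original data.

```python
import re

# Sample event from the question.
SAMPLE = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",'
    'CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",'
    'CommonPrefix.2.status="seen"'
)

# The regex from this answer, split over two lines for readability.
PATTERN = re.compile(
    r'(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)'
    r'(?:CommonPrefix\.1\.status=\")(\w*)(?:\")'
)


def extract(event):
    """Return (name, status) for index 1, or None if the pattern does not match."""
    match = PATTERN.search(event)
    return match.groups() if match else None


print(extract(SAMPLE))  # ('Field1', 'alive')

# Hypothetical variation: \w does not match a hyphen, so this event is not matched.
print(extract(SAMPLE.replace('"alive"', '"not-alive"')))  # None
```

Note that \w only matches letters, digits and underscores, so values containing spaces or punctuation would need a wider class such as [^"]*.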

MuS
SplunkTrust

Hi arkadyz1,

Your regex would work! But you have a formatting issue: your double quotes have been converted to curly "smart" quotes (”), and those do not match the straight quotes (") in the data 😉

This is working:

 CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"

This is not working:

 CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”
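
A quick way to see the difference outside Splunk is the Python sketch below. It uses Python's re module, not Splunk itself, and assumes the curly quote in question is U+201D; the variable names are just for this example.

```python
import re

# Sample event from the question (straight quotes in the data).
SAMPLE = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",'
    'CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",'
    'CommonPrefix.2.status="seen"'
)

# Pattern with straight double quotes.
STRAIGHT = r'CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"'

# Same pattern with every straight quote replaced by a curly quote (U+201D).
CURLY = STRAIGHT.replace('"', '\u201d')

straight_match = re.search(STRAIGHT, SAMPLE)
curly_match = re.search(CURLY, SAMPLE)

print(straight_match.groups())  # ('Field1', 'alive')
print(curly_match)              # None
```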

Hope this helps ...

cheers, MuS

arkadyz1
Builder

The quotes are fine in transforms.conf; it's just this site that converted them to curly quotes. So no, it's not that. I also tried escaping them with backslashes, which didn't work either.

MuS
SplunkTrust

Your regex works on the sample event you provided; see http://pasteboard.co/gzVlDIRjH.png

Make sure that your sourcetype matches, that you placed props.conf on the parsing Splunk instance, and that you restarted Splunk afterwards.

arkadyz1
Builder

I added the groups as suggested by adamsaul in the accepted answer and it started working. I also escaped the double quotes with backslashes, but I had already tried that before without success. Really strange...

MuS
SplunkTrust

Of course (facepalm) - good spotting in this case!

adamsaul
Communicator

The regex in my answer assumes you do not want to keep the surrounding double quotes (") in the extracted values.

arkadyz1
Builder

I'm not sure why adding capturing groups worked, but it did. Really weird...

adamsaul
Communicator

Technically your regex had capturing groups as well, but I also wrapped the literal parts in non-capturing groups so that Splunk doesn't interpret any other data as a capture (not that it should).
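
For what it's worth, a small Python sketch (Python's re module, not Splunk, so only a rough check) suggests that both versions expose the same two capturing groups on the sample event. It does not explain why the change made a difference inside Splunk. The labels "original" and "wrapped" are names chosen here for illustration.

```python
import re

# Sample event from the question.
SAMPLE = (
    'CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
    'CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",'
    'CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",'
    'CommonPrefix.2.status="seen"'
)

# The regex from the question (with straight quotes).
ORIGINAL = r'CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"'

# The regex from the accepted answer, with literals in non-capturing groups.
WRAPPED = (
    r'(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)'
    r'(?:CommonPrefix\.1\.status=\")(\w*)(?:\")'
)

results = {}
for label, pattern in (("original", ORIGINAL), ("wrapped", WRAPPED)):
    compiled = re.compile(pattern)
    # compiled.groups is the number of capturing groups in the pattern.
    results[label] = (compiled.groups, compiled.search(SAMPLE).groups())
    print(label, results[label])
# original (2, ('Field1', 'alive'))
# wrapped (2, ('Field1', 'alive'))
```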

Glad it worked for you!