Need advice on a complex field extraction

arkadyz1
Builder

I have some data in the following format:

CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",CommonPrefix.2.status="seen"

and so on. I would like to extract fields so that each .name value becomes a field name and the matching .status value becomes that field's value. The data above would yield two extra fields: Field1=alive and Field2=seen. I know that the numbers always go from 1 to 7, and that .name always precedes .status.
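To make the intended result concrete, here is a minimal sketch in Python (outside Splunk, so it only illustrates the mapping and is not a Splunk configuration). The helper name `name_status_fields` and the pairing regex are assumptions made for illustration:

```python
import re

# Sample event from the question.
SAMPLE = ('CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
          'CommonPrefix.1.status="alive",CommonPrefix.2.name="Field2",'
          'CommonPrefix.2.type="NUMBER",CommonPrefix.2.value="3",'
          'CommonPrefix.2.status="seen"')

# Matches every CommonPrefix.<n>.<attr>="<value>" pair in the event.
PAIR = re.compile(r'CommonPrefix\.(\d+)\.(\w+)="([^"]*)"')

def name_status_fields(event):
    """Hypothetical helper: map each group's name to its status."""
    groups = {}
    for index, attr, value in PAIR.findall(event):
        groups.setdefault(index, {})[attr] = value
    # Keep only groups that carry both a name and a status.
    return {g["name"]: g["status"]
            for g in groups.values()
            if "name" in g and "status" in g}

fields = name_status_fields(SAMPLE)
print(fields)  # {'Field1': 'alive', 'Field2': 'seen'}
```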

I tried to make a transform like this:
In props.conf:

[MySourceType]
TRANSFORMS-myfield1 = transformed_1
...
TRANSFORMS-myfield7 = transformed_7

and in transforms.conf:

[transformed_1]
REGEX = CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”
FORMAT = $1::$2
LOOKAHEAD = 1048576
...
[transformed_7]
REGEX = CommonPrefix\.7\.name=”([^”]*)”.*CommonPrefix\.7\.status=”([^”]*)”
FORMAT = $1::$2
LOOKAHEAD = 1048576

I'm using LOOKAHEAD because my events are quite long. I also tried the _KEY_1 and _VAL_1 named capturing groups, both instead of and in addition to FORMAT. Nothing worked: the fields are not extracted.

Any ideas on what to fix here?

adamsaul
Communicator

arkadyz1,

Try this regex:

(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)(?:CommonPrefix\.1\.status=\")(\w*)(?:\")

MuS
Legend

Hi arkadyz1,

Your regex would work! But you have a formatting issue: your double quotes have been converted to curly (typographic) quotes, which do not match the straight quotes in the data 😉

This works:

 CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"

This does not work:

 CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”
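The difference can be checked quickly outside Splunk. The following is a small sketch using Python's `re` module (not Splunk's own regex engine, so treat it as an approximation) against a shortened copy of the sample event:

```python
import re

# Shortened copy of the sample event, with straight double quotes.
SAMPLE = ('CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
          'CommonPrefix.1.status="alive"')

# Pattern with straight quotes (U+0022).
STRAIGHT = r'CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"'

# Same pattern with curly closing quotes (U+201D), as shown in the post.
CURLY = r'CommonPrefix\.1\.name=”([^”]*)”.*CommonPrefix\.1\.status=”([^”]*)”'

straight_match = re.search(STRAIGHT, SAMPLE)
curly_match = re.search(CURLY, SAMPLE)

print(straight_match.groups())  # ('Field1', 'alive')
print(curly_match)              # None
```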

Hope this helps ...

cheers, MuS

arkadyz1
Builder

The quotes are fine in transforms.conf; it's only this site that converted them to curly quotes. So no, it's not that. I also tried escaping them with backslashes, which didn't work either.

MuS
Legend

Your regex works on the sample event you provided; see http://pasteboard.co/gzVlDIRjH.png

Make sure that your sourcetype matches, that you placed props.conf on the parsing Splunk instance, and that you restarted Splunk afterwards.

arkadyz1
Builder

I added the capturing groups as suggested by adamsaul in the accepted answer, and it started working. I also escaped the double quotes with backslashes, although I had already tried that before. Really strange...

MuS
Legend

Of course (facepalm) - good spotting in this case!

adamsaul
Communicator

The regex in my answer above assumes that you do not want to keep the surrounding double quotes in the extracted values.

arkadyz1
Builder

I'm not sure why adding capturing groups worked, but it did. Really weird...

adamsaul
Communicator

Technically your regex has capturing groups as well; I also wrapped the surrounding literals in non-capturing groups so that Splunk doesn't interpret any other part of the match as data (not that it should).

Glad it worked for you!
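For anyone comparing the two patterns later, here is a rough check in Python's `re` module (an approximation, since it is not Splunk's regex engine). On the sample event both patterns capture the same two groups. The one behavioural difference this sketch shows is that `\w*` does not match a value containing a space; the value "Field 1" is a made-up example, not taken from the original data:

```python
import re

# Pattern from the question (with straight quotes).
ORIGINAL = r'CommonPrefix\.1\.name="([^"]*)".*CommonPrefix\.1\.status="([^"]*)"'

# Pattern from the accepted answer, with non-capturing groups and \w*.
GROUPED = (r'(?:CommonPrefix\.1\.name=\")(\w*)(?:\")(?:.*)'
           r'(?:CommonPrefix\.1\.status=\")(\w*)(?:\")')

SAMPLE = ('CommonPrefix.1.name="Field1",CommonPrefix.1.type="STRING",'
          'CommonPrefix.1.status="alive"')

# Hypothetical variant with a space in the name, for illustration only.
SPACED = SAMPLE.replace("Field1", "Field 1")

original_groups = re.search(ORIGINAL, SAMPLE).groups()
grouped_groups = re.search(GROUPED, SAMPLE).groups()
print(original_groups)  # ('Field1', 'alive')
print(grouped_groups)   # ('Field1', 'alive')

original_spaced = re.search(ORIGINAL, SPACED)
grouped_spaced = re.search(GROUPED, SPACED)
print(original_spaced.groups())  # ('Field 1', 'alive')
print(grouped_spaced)            # None
```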
