Splunk Search

Help with a regex to separate key/value pairs delimited by a character sequence

tpsplunk
Communicator

I'm having trouble crafting a regex that pulls key=value pairs where the pairs are separated by a character sequence, "+++" for example. I'd like to use a sequence because it's far less likely to show up in my log events than a single-character delimiter. Each log event is a single line. Here is a sample:

ts=2011-11-16T21:41:21Z+++aid=1167949209+++ip=1.1.1.2+++g=ir4205+++id=http://xyz.com/not/Import/10+++t="The Best ++Pt. 1 Book:Roland"+++sz=13880020+++pl=X_YZ+++pc=1+++pvid=1001+++rg=jfkdjd+++pid=asdf_1234+++rs=720+++rt=2

I've been testing with a transforms.conf entry like:

REGEX = (\w+)=([^\+]+(?!\+{3}))
FORMAT = $1::$2

But this regex leaves off the last character of every value.

I am essentially trying to get around the same issue listed here: I started with DELIMS, but I can't guarantee that my delimiter won't appear in my log entries, and there doesn't appear to be a way to escape the delimiter.
http://splunk-base.splunk.com/answers/3231/escaping-characters-in-an-event
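
For anyone wondering why the regex in the question drops a character: the greedy [^\+]+ stops right before the first "+", the negative lookahead then fails because "+++" follows, and the engine backtracks by one character so the lookahead can succeed. A minimal sketch in Python (an assumption for illustration; Splunk uses PCRE, but this pattern should behave the same way in both) on a shortened version of the sample event:

```python
import re

# Shortened version of the sample event from the question.
event = 'aid=1167949209+++t="The Best ++Pt. 1 Book:Roland"+++rt=2'

# The regex from the question's transforms.conf entry.
original = re.compile(r'(\w+)=([^\+]+(?!\+{3}))')

pairs = original.findall(event)
for key, value in pairs:
    print(key, repr(value))

# aid loses its final digit because of the backtracking described above.
# t is cut at the first "+" because [^\+] cannot cross any "+" at all.
# rt survives only because nothing follows it.
```

So the character class is the real limitation: it treats every single "+" as a stop, not just a run of three.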

1 Solution

Ayn
Legend

How about

(\w+)=(.+?)\+{3}

?


tpsplunk
Communicator

I tweaked Ayn's regex a little bit, and now it captures the last key=value pair:

(\w+)=(.+?)(?:\+{3}|$)

Here's another way to do it:

(\w+)=(.+?(?:(?=\+{3})|$))
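
A quick way to compare the patterns in this thread is to run them against the sample event from the question. The sketch below uses Python's re module as a stand-in (an assumption; Splunk itself uses PCRE, and the variable names are made up for illustration):

```python
import re

# The full sample event from the question.
event = (
    'ts=2011-11-16T21:41:21Z+++aid=1167949209+++ip=1.1.1.2+++g=ir4205+++'
    'id=http://xyz.com/not/Import/10+++t="The Best ++Pt. 1 Book:Roland"+++'
    'sz=13880020+++pl=X_YZ+++pc=1+++pvid=1001+++rg=jfkdjd+++'
    'pid=asdf_1234+++rs=720+++rt=2'
)

# Lazy value that must be followed by the delimiter.
delim_only = dict(re.findall(r'(\w+)=(.+?)\+{3}', event))

# Lazy value followed by the delimiter OR end of line.
delim_or_end = dict(re.findall(r'(\w+)=(.+?)(?:\+{3}|$)', event))

# Same idea, but the delimiter is only looked at, not consumed.
lookahead = dict(re.findall(r'(\w+)=(.+?(?:(?=\+{3})|$))', event))

print(len(delim_only), 'rt' in delim_only)
print(len(delim_or_end), delim_or_end['rt'])
print(delim_or_end == lookahead)
print(delim_or_end['t'])
```

The first pattern misses rt because no "+++" follows the last pair; the other two extract the same set of pairs, including the "++" inside the value of t.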

tpsplunk
Communicator

Hee hee, we got to the same place.


Ayn
Legend

Ah, forgot about that case. Well, just have the regex match either +++ or end-of-line ($):

(\w+)=(.+?)(?:\+{3}|$)

tpsplunk
Communicator

Actually, Ayn, your regex leaves off the last key=value pair because that last pair is not followed by the delimiter sequence. It is a good solution if I get my devs to end every log event with the delimiter sequence, though.


tpsplunk
Communicator

Nice, that works! This also works: (\w+)=(.+?(?:(?=\+{3})|$))
Not sure which is less resource-intensive.
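
One rough way to get a feel for the relative cost is to time both patterns on the sample event. This is only a sketch: it uses Python's re engine and timeit, not Splunk's PCRE, so the numbers are indicative at best and the iteration count is arbitrary.

```python
import re
import timeit

# The full sample event from the question.
event = (
    'ts=2011-11-16T21:41:21Z+++aid=1167949209+++ip=1.1.1.2+++g=ir4205+++'
    'id=http://xyz.com/not/Import/10+++t="The Best ++Pt. 1 Book:Roland"+++'
    'sz=13880020+++pl=X_YZ+++pc=1+++pvid=1001+++rg=jfkdjd+++'
    'pid=asdf_1234+++rs=720+++rt=2'
)

# Variant that consumes the delimiter (or stops at end of line).
consume = re.compile(r'(\w+)=(.+?)(?:\+{3}|$)')
# Variant that only looks ahead at the delimiter.
peek = re.compile(r'(\w+)=(.+?(?:(?=\+{3})|$))')

# Sanity check first: both variants should extract identical pairs.
same_pairs = consume.findall(event) == peek.findall(event)

runs = 2000  # arbitrary; raise it for steadier numbers
t_consume = timeit.timeit(lambda: consume.findall(event), number=runs)
t_peek = timeit.timeit(lambda: peek.findall(event), number=runs)

print('same pairs:', same_pairs)
print(f'consume delimiter: {t_consume:.4f}s for {runs} runs')
print(f'lookahead:         {t_peek:.4f}s for {runs} runs')
```

Both variants do the same lazy character-by-character scan for the delimiter, so any difference is likely to be small; measuring on real events in the actual Splunk environment would be the only reliable answer.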


tpsplunk
Communicator

Hmm. I tried making it a code block, but it still looks pretty nasty. That whole block should be a single line.


dwaddle
SplunkTrust

I started to clean up your question and put the event data into a code block so it formats easier to read and you don't have to escape any characters. But, I was afraid I'd mess up the context of your fairly complex event data. You might want to re-paste it and put it in a code block so that folks can decipher it more easily.


tpsplunk
Communicator

Note: I purposefully put a couple of "+" characters in the value for key "t" to make sure the regex ignores them (since they aren't three consecutive "+" characters).
