Splunk Search

help with regex to separate key/value pairs with a character sequence

tpsplunk
Communicator

I'm having trouble crafting a regex that pulls key=value pairs where the pairs are separated by a character sequence, "+++" for example. I'd like to use a sequence because it's far less likely to show up in my log events than a single-character delimiter. Each log event is a single line. Here is a sample:

ts=2011-11-16T21:41:21Z+++aid=1167949209+++ip=1.1.1.2+++g=ir4205+++id=http://xyz.com/not/Import/10+++t="The Best ++Pt. 1 Book:Roland"+++sz=13880020+++pl=X_YZ+++pc=1+++pvid=1001+++rg=jfkdjd+++pid=asdf_1234+++rs=720+++rt=2

I've been testing with a transforms.conf entry like:

REGEX = (\w+)=([^\+]+(?!\+{3}))
FORMAT = $1::$2

But this regex leaves off the last character of every value.
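
To see why the last character disappears, it can help to replay the match outside Splunk. The sketch below uses Python's `re` module as a stand-in; Splunk uses PCRE, and the assumption here is that these particular constructs behave the same way in both engines. `[^\+]+` greedily consumes the whole value, the negative lookahead `(?!\+{3})` then fails because `+++` follows, and the engine backtracks one character to a spot where the lookahead passes.

```python
import re

# Sample event from the question (one line, split here only for readability).
EVENT = ('ts=2011-11-16T21:41:21Z+++aid=1167949209+++ip=1.1.1.2+++g=ir4205'
         '+++id=http://xyz.com/not/Import/10'
         '+++t="The Best ++Pt. 1 Book:Roland"'
         '+++sz=13880020+++pl=X_YZ+++pc=1+++pvid=1001+++rg=jfkdjd'
         '+++pid=asdf_1234+++rs=720+++rt=2')

# The REGEX from the transforms.conf entry above.
ORIGINAL = re.compile(r'(\w+)=([^\+]+(?!\+{3}))')

# Collect the extracted pairs into a dict so individual keys are easy to inspect.
pairs = dict(ORIGINAL.findall(EVENT))

for key, value in pairs.items():
    print(f'{key} -> {value!r}')
```

In this run, `ts` loses its trailing `Z`, `t` stops at the first `+` inside the title, and `pc` is not extracted at all because a one-character value has nothing left to give up when the engine backtracks. Only `rt`, which sits at the end of the line, comes through intact.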

I am essentially trying to get around the same issue described here: I started with DELIMS, but I can't guarantee that my delimiter won't appear in my log entries, and there doesn't appear to be a way to escape the delimiter.
http://splunk-base.splunk.com/answers/3231/escaping-characters-in-an-event

1 Solution

Ayn
Legend

How about

(\w+)=(.+?)\+{3}

?

tpsplunk
Communicator

I tweaked Ayn's regex a little bit, and now it captures the last key=value pair:

(\w+)=(.+?)(?:\+{3}|$)

Here's another way to do it:

(\w+)=(.+?(?:(?=\+{3})|$))
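
A quick way to check that the two variants extract the same pairs is to run both against an event that has a "++" inside a value and no trailing delimiter. This is a minimal sketch in Python's `re` module, assumed to behave like Splunk's PCRE for these constructs; the event is shortened from the sample in the question, and the names `CONSUMING` and `LOOKAHEAD` are just labels for this example.

```python
import re

# Shortened from the sample event: a value containing "++" and a final
# pair that is not followed by the "+++" delimiter.
EVENT = 'ip=1.1.1.2+++t="The Best ++Pt. 1 Book:Roland"+++rt=2'

# Variant 1: the delimiter (or end of line) is consumed outside the capture group.
CONSUMING = re.compile(r'(\w+)=(.+?)(?:\+{3}|$)')
# Variant 2: a lookahead inside the capture group; the delimiter is not consumed.
LOOKAHEAD = re.compile(r'(\w+)=(.+?(?:(?=\+{3})|$))')

consuming_pairs = CONSUMING.findall(EVENT)
lookahead_pairs = LOOKAHEAD.findall(EVENT)

print(consuming_pairs)
print(consuming_pairs == lookahead_pairs)
```

Both patterns return the same three pairs here, with the "++" kept inside the value of `t` and the final `rt=2` included.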

tpsplunk
Communicator

Hee hee, we got to the same place.

Ayn
Legend

Ah, forgot about that case. Well, just have the regex match either +++ or end-of-line ($):

(\w+)=(.+?)(?:\+{3}|$)

tpsplunk
Communicator

Actually, Ayn, your regex leaves off the last key=value pair, because that last pair is not followed by the delimiter sequence. It is a good solution if I get my devs to end every log event with the delimiter sequence, though.
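
The missing last pair is easy to reproduce. The sketch below uses Python's `re` module as a stand-in for Splunk's PCRE, which is an assumption, and a shortened event built from the tail of the sample in the question. It also shows the workaround mentioned above, where every event ends with the delimiter.

```python
import re

# Tail of the sample event; the last pair has no trailing "+++".
EVENT = 'pid=asdf_1234+++rs=720+++rt=2'

# Ayn's first suggestion: every value must be followed by "+++".
DELIM_ONLY = re.compile(r'(\w+)=(.+?)\+{3}')

without_trailer = DELIM_ONLY.findall(EVENT)
with_trailer = DELIM_ONLY.findall(EVENT + '+++')

print(without_trailer)  # rt is missing
print(with_trailer)     # rt is present once the event ends with "+++"
```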

tpsplunk
Communicator

Nice, that works! This also works: (\w+)=(.+?(?:(?=\+{3})|$))
Not sure which is less resource-intensive.
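
One rough way to compare the cost of the two patterns is to time them on a representative event. The sketch below does that with Python's `timeit`. Python's regex engine is not the PCRE engine Splunk uses, so the numbers do not transfer to Splunk search or index time; the sketch only illustrates the measuring approach. The helper `time_pattern` and the run count are hypothetical choices for this example.

```python
import re
import timeit

# Shortened from the sample event in the question.
EVENT = 'ip=1.1.1.2+++t="The Best ++Pt. 1 Book:Roland"+++sz=13880020+++rt=2'

CONSUMING = re.compile(r'(\w+)=(.+?)(?:\+{3}|$)')
LOOKAHEAD = re.compile(r'(\w+)=(.+?(?:(?=\+{3})|$))')


def time_pattern(pattern, runs=2000):
    """Return the total seconds taken by `runs` full extractions of EVENT."""
    return timeit.timeit(lambda: pattern.findall(EVENT), number=runs)


consuming_seconds = time_pattern(CONSUMING)
lookahead_seconds = time_pattern(LOOKAHEAD)

print(f'consuming: {consuming_seconds:.4f}s  lookahead: {lookahead_seconds:.4f}s')
```

Timings vary from run to run and machine to machine, so any difference is only meaningful if it holds up across repeated runs on realistic events.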

tpsplunk
Communicator

Hmm. I tried making it a code block, but it still looks pretty nasty. That whole block should be a single line.

dwaddle
SplunkTrust

I started to clean up your question and put the event data into a code block so that it is easier to read and you don't have to escape any characters. But I was afraid I'd mess up the context of your fairly complex event data. You might want to re-paste it and put it in a code block so that folks can decipher it more easily.

tpsplunk
Communicator

Note: I purposely put a couple of "+" characters in the value for key "t" to make sure the regex ignores them (since they aren't three consecutive "+" characters).
