Hello,
I have a not ideal log, looking like this, for example:
"field1=value1" "field2=val ue 2" "field3=value3"
And I want to exlude the key-value pairs at index time.
Combinations like the first kv-pair is not problem. The second value however is a problem. With my extraction I can only get the "val" part, and the extraction stops at the whitespace.
My rule in transforms.conf looks like this:
[example]
REGEX = (?<_KEY_1>([^=\"]+)=(?<_VAL_1>([^=\"]+)
To clarify, my results in splunk are looking like this:
field1 = value1
field2 = val
field3 = value3
I am not sure what I am missing.
You have the right regex, but with a slightly muddled syntax (too many left parens). Try this alternative:
[example]
REGEX = ([^=\"]+)=([^=\"]+)
FORMAT = $1::$2
Oh, you are right. I had run so many tests yesterday that it was getting a little confusing.
I've tried both variants, with the same regex:
[example]
REGEX = (?<_KEY_1>([^=\"]+))=(?<_VAL_1>([^=\"]+))
and
[example]
REGEX = ([^=\"]+)=([^=\"]+)
FORMAT = $1::$2
but I am only getting the value to the first whitespace.
"key=val ue"
will result in
key=val
with both variants.
I've tested all changes in props.conf / transforms.conf under etc/system/local to ensure, that there is no other setting that is overwriting my tests.
Maybe I should also mention that these value are embedded in some kind of pseudo json format. I am, however, not using indexed extractions.
The events are looking something like this:
{"severity":"info","time":"123456789","message":"key1=value1" "key2=val ue 2"}
The regex should work. It works in regex101.com. See https://regex101.com/r/ZHCFQp/1
If the key-value pairs are enclosed in parentheses I'd anchor the regex in parentheses as well.