While ingesting a data source that arrives over syslog with the basic structure syslog header key="value",key="value",key="value" etc., we run into an issue when using KV_MODE = auto_escaped. In rare cases one of the fields contains = signs, which causes Splunk to treat whatever is to the left of that = as a field name and whatever is to the right as a value. In effect, it extracts 'fake' key-value pairs nested inside values.
Now, I can imagine this is as designed and just how auto kv behaves. Without giving Splunk more concrete instructions, this is as good as it gets.
But when investigating alternative extraction methods that do not rely on auto kv, I noticed that when I apply the kv command in the search bar with pairdelim="," kvdelim="=", it still extracts 'nested fields' when it encounters = signs in the field values. I was expecting that explicitly defining the pairdelim would prevent that. It even extracts multiple nested fields from the same value, even though the pairdelim does not occur there.
Run anywhere example to reproduce the situation:
| makeresults
| eval _raw="a=\"123\",b=\"abc test = yada foo abc=456 bar.docx\",c=\"bla\""
| kv pairdelim="," kvdelim="="
Note how this returns the fields test=yada and abc=456.
PS: when replacing the , in the generated _raw field with a space or ;, the result is the same. So what is the purpose of the pairdelim parameter then?
Are my expectations simply wrong, or is this a bug somehow? And would a DELIMS-based transform extraction behave the same as the kv command?
So, we tried using a DELIMS = ",", "=" transform, and that does work as expected, without extracting any extra fields.
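For anyone wanting to reproduce this, a minimal sketch of what such a search-time setup could look like is below. The stanza name syslog_kv_delims and the sourcetype name my_syslog_sourcetype are made up for illustration, and setting KV_MODE = none is an assumption on my part (so that auto kv does not extract the nested pairs again alongside the transform). It also does not deal with the syslog header in front of the first key.

```
# transforms.conf
# Hypothetical stanza name. First string = pair delimiter, second = key/value delimiter.
[syslog_kv_delims]
DELIMS = ",", "="

# props.conf
# Hypothetical sourcetype name.
[my_syslog_sourcetype]
# Assumption: switch off automatic key-value extraction for this sourcetype.
KV_MODE = none
REPORT-syslog_kv = syslog_kv_delims
```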
Still wondering why the behavior of | kv pairdelim="," kvdelim="=" is different.