topic Re: Use of tokenizer option with makemv in Splunk Search

Use of tokenizer option with makemv

agodoy — Tue, 02 Apr 2013 14:25:24 GMT

I am trying to break a field into multiple values based on a regex. Apparently this can be done with the tokenizer option of the makemv command. However, there is no example of how to use it, and whenever I try I get the following error: "Error in 'makemv' command: The tokenizer regular expression is invalid"

Basically, I am trying to break on commas (,) that are not followed by a blank space.

End goal: "4,Something" would result in a new value, but "4, Something" would not.

Re: Use of tokenizer option with makemv

martin_mueller — Tue, 02 Apr 2013 16:39:10 GMT

This probably works for you:

tokenizer="([^,]*)(,(\s[^,]*,?)*)?"

The tokenizer first captures a value:

([^,]*)

and then gobbles up everything that's not a field:

(,(\s[^,]*,?)*)?

PS: As per jonuwz's answer I may have treated ", " badly 🙂
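
To see what the PS is getting at, here is a minimal sketch that runs the same pattern through Python's re module. Python is only a stand-in for Splunk's regex engine, tokenize is a hypothetical helper, and skipping zero-length matches is an assumption about how the empty match at the end of the string should be treated.

```python
import re

# Pattern copied from the post above.
TOKENIZER = r"([^,]*)(,(\s[^,]*,?)*)?"

def tokenize(text):
    """Collect capture group 1 of every non-empty match (hypothetical helper)."""
    values = []
    for m in re.finditer(TOKENIZER, text):
        if m.group(0):  # assumption: ignore zero-length matches
            values.append(m.group(1))
    return values

print(tokenize("4,Something"))   # ['4', 'Something']
print(tokenize("4, Something"))  # ['4']
```

In this sketch the text after ", " falls into the second, discarded group, so "4, Something" is not split but also loses " Something" from the value.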

Re: Use of tokenizer option with makemv

jonuwz — Tue, 02 Apr 2013 16:39:27 GMT

Example :

| gentimes start=-1 
| eval john="1 something,2 something else,3 something, with a comma,4 wibble"
| table john
| makemv tokenizer="(.+?)(?=,\S|$),?" john

What is this? : "(.+?)(?=,\S|$),?"

For the tokenizer to work you need a capture group: makemv takes the first capture group of each match as one value of the multivalue field.

What we're saying here is:

(.+?)      grab everything, as few characters as possible - this is the capture group
(?=,\S|$)  until you get to a comma followed by a non-whitespace, or the end of the line
,?         if there's a comma at the end of the pattern, eat it
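
The three pieces above can be tried outside Splunk with a short sketch like the one below. It assumes Python's re module behaves like Splunk's regex engine for this pattern, and tokenize is a hypothetical helper that mimics makemv by keeping capture group 1 of each match.

```python
import re

# Pattern copied from the makemv example above.
TOKENIZER = r"(.+?)(?=,\S|$),?"

def tokenize(text):
    """Return capture group 1 of each successive match (hypothetical helper)."""
    return [m.group(1) for m in re.finditer(TOKENIZER, text)]

# The two cases from the original question.
print(tokenize("4,Something"))   # ['4', 'Something']
print(tokenize("4, Something"))  # ['4, Something']

# The sample value used in the search above, one token per line.
john = "1 something,2 something else,3 something, with a comma,4 wibble"
for value in tokenize(john):
    print(value)
```

The lazy (.+?) is what lets the lookahead decide where each value ends; a greedy (.+) would run to the end of the string and return a single value.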

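result : john becomes a multivalue field with four values:

1 something
2 something else
3 something, with a comma
4 wibble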

Re: Use of tokenizer option with makemv

ckp123 — Tue, 26 Nov 2019 05:29:54 GMT

A simple replace would do this job.

| replace "," with ", " in john

PS: As per my understanding of the requirement.