Splunk Search

Use of tokenizer option with makemv

agodoy
Communicator

I am trying to break a field based on some regex. Apparently this can be done with the tokenizer option of the makemv command. However, there is no example on how to use it and I keep getting the following error when I try "Error in 'makemv' command: The tokenizer regular expression is invalid"

Basically, I am trying to break on commas(,) that are not followed by a blank space.

End goal: "4,Something" would result in a new value, but "4, Something" would not.

Tags (2)
0 Karma
1 Solution

jonuwz
Influencer

Example :

| gentimes start=-1 
| eval john="1 something,2 something else,3 something, with a comma,4 wibble"
| table john
| makemv tokenizer="(.+?)(?=,\S|$),?" john

What is this? : "(.+?)(?=,\S|$),?"

For the tokenizer to work you need capture groups.

What we're saying here is

(.+?)      grab everything - this is the capture group
(?=,\S|$)  until you get to a comma followed by a non-whitespace, or the end of the line
,?         if there's a comma at the end of the pattern, eat it

result :

alt text

View solution in original post

ckp123
Explorer

As simple replace would do this job.

| replace "," with ", " in john

PS : As per my understood on the requirement

0 Karma

jonuwz
Influencer

Example :

| gentimes start=-1 
| eval john="1 something,2 something else,3 something, with a comma,4 wibble"
| table john
| makemv tokenizer="(.+?)(?=,\S|$),?" john

What is this? : "(.+?)(?=,\S|$),?"

For the tokenizer to work you need capture groups.

What we're saying here is

(.+?)      grab everything - this is the capture group
(?=,\S|$)  until you get to a comma followed by a non-whitespace, or the end of the line
,?         if there's a comma at the end of the pattern, eat it

result :

alt text

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

This probably works for you:

tokenizer="([^,]*)(,(\s[^,]*,?)*)?"

The tokenizer first captures a value:

([^,]*)`)

and then gobbles up everything that's not a field:

(,(\s[^,]*,?)*)?

PS: As per jonuwz's answer I may have treated ", " badly 🙂

0 Karma
.conf21 CFS Extended through 5/20!

Don't miss your chance
to share your Splunk
wisdom in-person or
virtually at .conf21!

Call for Speakers has
been extended through
Thursday, 5/20!