I have CSV events like this:
f1,f2,{f3a,f3b},f4,{f5a,{f5b1,f5b2,{f5c2a,f5c2b}}},f6
Only certain fields have sub-fields, which are delimited by curly braces (in this case f3 and f5).
I'd like to do a field extraction in transforms.conf that captures each top-level field. It should look something like this (which would work if there were no nested curly braces at all):
REGEX = ^([^,]*),([^,]*),\{([^\}]*)\},([^,]*),\{([^\}]*)\},([^,]*)
The fly in the ointment is that I do not know the depth of the curly brace nesting so it is not possible to build an explicitly OR'd REGEX.
Does Splunk support .NET style REGEX "balancing groups" or is there some other way to do this (e.g. pass the event to a Python script that does the extraction)?
Does Splunk support .NET style REGEX "balancing groups"
Not exactly, but PCRE has the concept of a "recursive pattern" (written (?R) for the whole pattern, or (?1), (?2), ... for a numbered group) that can closely emulate balancing groups. Splunk uses PCRE, so this should work there as well. A good starting point is the PCRE documentation: search for "RECURSIVE PATTERNS", where there is a usable starting example.
... is there some other way to do this (e.g. pass the event to a Python script that does the extraction)?
One thing you can try if you are proficient in Python is to write a custom search command.
Note that this is not trivial. You can take a look at some examples in $SPLUNK_HOME/etc/apps/search/bin. You will specifically want to look at commands like diff.py or rangemap.py, which perform field-level operations.
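If you go the Python route, the parsing itself does not need a regex at all. The following is only a rough sketch of the splitting logic, not a complete search command: it walks the event one character at a time, tracks the brace depth, and splits on commas only at depth 0. The function names split_top_level and strip_outer are made up for illustration, and the Splunk-side plumbing (reading results, writing fields back) is left out entirely. It also assumes a field is either plain text or one single brace group.

```python
def strip_outer(field):
    """Drop one enclosing pair of braces, if the field is wrapped in one.

    Assumption: a field is either plain text or a single {...} group.
    """
    if len(field) >= 2 and field[0] == "{" and field[-1] == "}":
        return field[1:-1]
    return field


def split_top_level(event, sep=","):
    """Split an event on separators that sit outside any curly braces."""
    fields = []
    current = []
    depth = 0  # current brace nesting level
    for ch in event:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced closing brace")
        if ch == sep and depth == 0:
            # A separator at depth 0 ends the current top-level field.
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    if depth != 0:
        raise ValueError("unbalanced opening brace")
    fields.append("".join(current))
    return [strip_outer(f) for f in fields]


event = "f1,f2,{f3a,f3b},f4,{f5a,{f5b1,f5b2,{f5c2a,f5c2b}}},f6"
print(split_top_level(event))
# ['f1', 'f2', 'f3a,f3b', 'f4', 'f5a,{f5b1,f5b2,{f5c2a,f5c2b}}', 'f6']
```

Because the depth is just a counter, the nesting can be arbitrarily deep. To break f5 down further, call split_top_level again on its value.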