Splunk Regex Engine Fails?

morethanyell
Builder

We're trying to extract fields that match this pattern, [ FIELD_NAME = S0m3 Valu3 w\ reaLLy $pec!aL ch*rac+3rs ], and write them to tsidx so that they're consumable by tstats. We're using the transforms/props pairing below.

# transforms.conf
[hello_transforms]
REGEX = (?<key>[\w]+)\s\=\s(?<value>[^\]]+)
FORMAT = $1::$2
REPEAT_MATCH = true
WRITE_META = true

# props.conf
[hello]
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
pulldown_type = 1
TRANSFORMS-capturer = hello_transforms

This does what's expected for most fields (i.e. the fields are written to disk, verified through walklex), but some values are not captured at all or are captured only partially. For example:
[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]
Splunk only captured "A". See screenshot below.

[Screenshot: the indexed REMARKS field, showing only "A" captured]

The regex is valid:

[Screenshot: regex validation]

Do you think this is the fault of Splunk's regex engine, or do I have something wrong in my configs?

Thanks in advance.

to4kawa
Ultra Champion

Sample:

| makeresults 
| eval _raw="Feb 7 11:25:20 SYD-UTIL-02 ADAuditPlus [ Category = LogonReports ] [ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]"
| rex max_match=0 "\[\s*(?<key>\S+)\s\=\s(?<value>.*?)\]"

transforms.conf

 REGEX = \[\s*(\S+)\s\=\s(.*?)\]

The closing ] is needed in the pattern.

If you use FORMAT in transforms.conf, named capture groups are not needed.

Using FORMAT:
REGEX = ([a-z]+)=([a-z]+)
FORMAT = $1::$2

Not using FORMAT:
REGEX = (?<_KEY_1>[a-z]+)=(?<_VAL_1>[a-z]+)

cf. Configure index-time field extraction
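To make the FORMAT pairing concrete, here is a minimal Python sketch that emulates what `FORMAT = $1::$2` with `REPEAT_MATCH = true` is meant to produce for the sample event. It only mimics the pattern logic: Python's `re` module is not Splunk's regex engine, the helper name `emulate_format` is hypothetical, and the `.strip()` call is an addition to drop the trailing space that `(.*?)` captures before the closing bracket.

```python
import re

# Pattern from the transforms.conf suggestion above (numbered groups only).
PATTERN = re.compile(r"\[\s*(\S+)\s\=\s(.*?)\]")


def emulate_format(raw):
    """Build one 'key::value' string per match, like $1::$2 applied repeatedly."""
    # .strip() is an assumption added here; it is not part of the Splunk config.
    return [f"{m.group(1)}::{m.group(2).strip()}" for m in PATTERN.finditer(raw)]


event = (
    "Feb 7 11:25:20 SYD-UTIL-02 ADAuditPlus [ Category = LogonReports ] "
    "[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]"
)

pairs = emulate_format(event)
for pair in pairs:
    print(pair)
```

On this event the pattern yields two pairs, one for Category and one for REMARKS, and the REMARKS value spans the whole sentence.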

morethanyell
Builder

Same result

to4kawa
Ultra Champion

@morethanyell
Did you restart/refresh Splunk?
At least for [ REMARKS = A Kerberos authentication ticket (TGT) was requested. ], the result should not be the same.

morethanyell
Builder

I edited transforms.conf with your regex, stopped Splunk, and deleted the index using "clean eventdata" (don't worry, it's a dev machine). Then I restarted Splunk and reindexed the file using oneshot. It still fails to capture the entire value. It stops at whitespace.

morethanyell
Builder

My old regex also works with | rex, but it does not work in transforms.conf.

to4kawa
Ultra Champion

@morethanyell
We both made a mistake. I've updated my answer.
I'm sorry.

morethanyell
Builder

Same issue, mate. I've used your transforms and it still fails to capture the entire value; it halts at whitespace.


[aap_fields_discov]
REGEX = \[\s*(\S+)\s\=\s(.*?)\s\]
REPEAT_MATCH = true
WRITE_META = true

to4kawa
Ultra Champion

(T_T)

SEDCMD-whitespace = s/\s/ /g

Why does REGEX halt at whitespace?
I don't understand.

morethanyell
Builder

On paper, it should capture this:
[ FIELDNAME = The quick brown fox jumps over the lazy dog. ]
If you try it with | rex or on regex101.com, it works. But when implemented in transforms.conf, it only captures "The", so the extracted pair is "FIELDNAME = The" instead of the entire "FIELDNAME = The quick brown fox jumps over the lazy dog."

Showing that the regex works via | rex or regex101.com no longer proves anything, because, as I've said before, it does work there. It just doesn't work when used in transforms.conf for index-time field extraction.
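The claim that the patterns themselves are sound can be checked outside Splunk as well. Below is a minimal Python sketch, under the assumption that Python's `re` behaves like Splunk's engine for these simple constructs (it is a different engine, so this only checks the pattern logic). The original pattern is rewritten with `(?P<name>...)` because Python does not accept the `(?<name>...)` spelling; the variable names are illustrative.

```python
import re

# Original pattern from the question, with Python-style named groups.
ORIGINAL = re.compile(r"(?P<key>[\w]+)\s\=\s(?P<value>[^\]]+)")
# Revised pattern from the [aap_fields_discov] stanza.
REVISED = re.compile(r"\[\s*(\S+)\s\=\s(.*?)\s\]")

line = "[ FIELDNAME = The quick brown fox jumps over the lazy dog. ]"

# The original pattern keeps the space before "]", so strip it for comparison.
original_value = ORIGINAL.search(line).group("value").strip()
revised_value = REVISED.search(line).group(2)

print(repr(original_value))
print(repr(revised_value))
```

Both patterns return the full sentence here, which is consistent with the observation that the truncation at whitespace happens somewhere after the regex match rather than in the match itself.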

Out of frustration, I've changed strategy: instead of capturing the fields through transforms.conf, I now enclose the values in double quotes (e.g. [ FIELDNAME = s0m3 vaLu3 ] becomes [ FIELDNAME ="s0m3 vaLu3" ] ) using SEDCMD in props.conf.
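The actual SEDCMD expression is not shown in this thread. As an illustration only, the rewrite it describes can be sketched in Python with `re.sub`; the pattern and the function name `quote_values` below are hypothetical, and the sketch assumes values contain neither "]" nor double quotes.

```python
import re

# Hypothetical pattern: "[ KEY = value ]" with single spaces around the parts.
BRACKETED = re.compile(r"\[ (\w+) = ([^\]]+?) \]")


def quote_values(raw):
    """Rewrite '[ KEY = value ]' as '[ KEY ="value" ]', as described above."""
    return BRACKETED.sub(r'[ \1 ="\2" ]', raw)


sample = "[ FIELDNAME = s0m3 vaLu3 ]"
quoted = quote_values(sample)
print(quoted)
```

Applied to the REMARKS example from the question, the same function wraps the whole sentence in quotes, so the value no longer contains unquoted whitespace.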

Thanks for the help.
