Splunk Search

Splunk Regex Engine Fails?

morethanyell
Builder

We're trying to extract fields that match the pattern [ FIELD_NAME = S0m3 Valu3 w\ reaLLy $pec!aL ch*rac+3rs ] and write them to the tsidx files so that they're consumable by tstats. We're using the transforms.conf/props.conf pairing below.

# transforms.conf
[hello_transforms]
REGEX = (?<key>[\w]+)\s\=\s(?<value>[^\]]+)
FORMAT = $1::$2
REPEAT_MATCH = true
WRITE_META = true

#props.conf
[hello]
DATETIME_CONFIG =
LINE_BREAKER = ([\r\n]+)
NO_BINARY_CHECK = true
category = Custom
pulldown_type = 1
TRANSFORMS-capturer = hello_transforms

It does what's expected for most fields (they are written to disk, verified with walklex), but some values are not captured at all or are captured only partially. For example:
[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]
Splunk only captured "A". See screenshot below.

[screenshot: Splunk showing only "A" captured for REMARKS]

REGEX VALID:

[screenshot: the regex validated outside Splunk]

Do you think this is the fault of Splunk's regex engine, or do I have something wrong in my configs?

Thanks in advance.

to4kawa
Ultra Champion

Sample:

| makeresults 
| eval _raw="Feb 7 11:25:20 SYD-UTIL-02 ADAuditPlus [ Category = LogonReports ] [ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]"
| rex max_match=0 "\[\s*(?<key>\S+)\s\=\s(?<value>.*?)\]"

transforms.conf

REGEX = \[\s*(\S+)\s\=\s(.*?)\]

The pattern needs the closing ].

If you use FORMAT in transforms.conf, named capture groups are not needed.

Using FORMAT:
REGEX = ([a-z]+)=([a-z]+)
FORMAT = $1::$2

Not using FORMAT:
REGEX = (?<_KEY_1>[a-z]+)=(?<_VAL_1>[a-z]+)

cf. Configure index-time field extraction
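
As a rough illustration of what FORMAT = $1::$2 expresses, the Python sketch below substitutes the first and second capture groups into a key::value string. It only mimics the substitution: `apply_format` is a hypothetical helper, not a Splunk API, and Python's `re` module stands in for Splunk's regex engine.

```python
import re

# Same pattern as the "Using FORMAT" example above.
PATTERN = re.compile(r"([a-z]+)=([a-z]+)")

def apply_format(text):
    """Return 'key::value' strings, mimicking FORMAT = $1::$2 (hypothetical helper)."""
    # \1 and \2 play the role of $1 and $2 in transforms.conf.
    return [m.expand(r"\1::\2") for m in PATTERN.finditer(text)]

print(apply_format("user=alice action=login"))
# prints: ['user::alice', 'action::login']
```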

morethanyell
Builder

Same result

to4kawa
Ultra Champion

@morethanyell
Did you restart/refresh Splunk?
At least for [ REMARKS = A Kerberos authentication ticket (TGT) was requested. ], the result should not be the same.

morethanyell
Builder

Edited transforms.conf with your regex. Stopped Splunk. Deleted the indexed data using "clean eventdata" (don't worry, it's a dev machine). Then restarted Splunk. Re-indexed the file using one-shot. It still fails to capture the entire value. It stops at whitespace.

morethanyell
Builder

My old regex also works with | rex, but it does not work in transforms.conf.

to4kawa
Ultra Champion

@morethanyell
We both made a mistake. I've updated my answer.
I'm sorry.

morethanyell
Builder

Same issue, mate. I've used your transforms and it still fails to capture the entire value and halts at whitespace.

[aap_fields_discov]
REGEX = \[\s*(\S+)\s\=\s(.*?)\s\]
REPEAT_MATCH = true
WRITE_META = true

to4kawa
Ultra Champion

(T_T)

SEDCMD-whitespace = s/\s/ /g

Why does REGEX halt at whitespace?
I don't understand.

morethanyell
Builder

On paper, it should capture this:
[ FIELDNAME = The quick brown fox jumps over the lazy dog. ]
If you try it with | rex or on regex101.com, it works. But when implemented in transforms.conf, it only captures "The", so the field will be "FIELDNAME = The" instead of the entire "FIELDNAME = The quick brown fox jumps over the lazy dog."

There's no point anymore in showing that the regex works via | rex or regex101.com because, as I've said before, it does work there. It just doesn't work when used in transforms.conf for index-time field extraction.
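
The claim that the pattern itself is sound can be checked outside Splunk. Below is a minimal sketch, assuming Python's `re` module behaves like Splunk's regex engine for this particular pattern. It does not reproduce the index-time pipeline, so it says nothing about why transforms.conf truncates the value. `extract_pairs` is a hypothetical helper name.

```python
import re

# The bracket-anchored pattern from the transforms.conf stanza above.
PAIR = re.compile(r"\[\s*(\S+)\s=\s(.*?)\s\]")

def extract_pairs(event):
    """Return (key, value) tuples for every [ KEY = value ] block in the event."""
    return PAIR.findall(event)

event = (
    "Feb 7 11:25:20 SYD-UTIL-02 ADAuditPlus [ Category = LogonReports ] "
    "[ REMARKS = A Kerberos authentication ticket (TGT) was requested. ]"
)
for key, value in extract_pairs(event):
    # The REMARKS value comes back whole, spaces included.
    print(f"{key} -> {value!r}")
```

Run against the sample event from this thread, the regex returns the full REMARKS sentence, which is consistent with the | rex and regex101.com results reported above.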

Out of frustration, I've changed strategy: I now enclose the values in double quotes (e.g. [ FIELDNAME = s0m3 vaLu3 ] becomes [ FIELDNAME ="s0m3 vaLu3" ] ) using SEDCMD in props.conf instead of relying on transforms.conf alone.
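
The thread does not show the actual SEDCMD expression, so the following is only a sketch of the quoting rewrite described above, written in Python rather than sed syntax. The pattern and the helper name `quote_values` are assumptions made for illustration.

```python
import re

# Matches [ KEY = value ] and captures the key and the trimmed value.
BLOCK = re.compile(r"\[\s*(\w+)\s=\s([^\]]+?)\s*\]")

def quote_values(event):
    """Rewrite [ KEY = value ] as [ KEY ="value" ], as in the example above."""
    return BLOCK.sub(r'[ \1 ="\2" ]', event)

print(quote_values("[ FIELDNAME = s0m3 vaLu3 ]"))
# prints: [ FIELDNAME ="s0m3 vaLu3" ]
```

Values that already contain double quotes would need escaping, which this sketch does not handle.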

Thanks for the help.
