Splunk Search

Delimited field extracts always result in rex errors

JosephHobbs
Path Finder

I recently started trying to set up some field extractions for a few of our events.  In this case, the logs are pipe-delimited and contain only a few segments.  What I've found is that most of these attempts result in a rex error about limits in limits.conf.

For example, take this record:

2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult {resultCode=0, matchedDN=null, errorMessage=null} ### nEntries=1 ### etime=3 ###

When I attempt to use a pipe-delimited field extraction (for testing), the result is this error:

[Screenshot of the rex limits error, which includes the generated regex: JosephHobbs_0-1643909429403.png]

When I toss this regex (from the error) into regex101 (https://regex101.com/r/IswlNh/1) it tells me it requires 2473 steps, which is well above the default 1000 for depth_limit...  How is it that an event with 4 segments delimited by pipe is so bad?

I realize there are 2 limits (depth_limit/match_limit) in play here and I can increase them, but nowhere can I find recommended values to use as a sanity check.  I also realize I can optimize the regex, but as I am setting this up via the UI using the delimited option, I don't have access to the regex at creation time.  Not to mention, many of my users are using this option because they are not regex gurus...

So my big challenge/question is...  Where do I go from here?  My users are going to use the delimited option, which evidently generates some seriously inefficient regex under the covers.  Do I increase my limit(s), and if so, what is a sane/safe value?  Is there something I'm missing?

Thanks!


ITWhisperer
SplunkTrust

Can you use the fact that pipes are the delimiter character?

^(?P<field1>[^\|]+)\s\|(?P<field2>[^\|]+)\|(?P<field3>.*)

https://regex101.com/r/MLYmkL/1
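One way to sanity-check this pattern inside Splunk itself is to run it with rex against the sample event. This is only a sketch: the field1/field2/field3 names come from the regex above, and the makeresults event is a stand-in for real indexed data.

```
| makeresults
| eval _raw="2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult {resultCode=0, matchedDN=null, errorMessage=null} ### nEntries=1 ### etime=3 ###"
| rex "^(?P<field1>[^\|]+)\s\|(?P<field2>[^\|]+)\|(?P<field3>.*)"
| table field1 field2 field3
```

Because the first two segments are matched with a negated character class ([^\|]+) rather than a wildcard, the engine should have very little to backtrack over, which is what keeps the step count low.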

 


JosephHobbs
Path Finder

No doubt the regex can be improved significantly as you demonstrated.  I guess my challenge is, how do I tell my users that the OOB delimited option just doesn't work and that they now have to go learn regex to extract their fields?

At the end of the day, I see 3 possibilities...

  • I'm doing something wrong...
  • The default limits are just too low and I should increase them (to what?)...
  • Splunk's delimited parsing UI just generates really inefficient regex...
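
For reference on the second possibility: the two rex limits are set in the [rex] stanza of limits.conf. The sketch below only shows where they would go. The values are what are understood to be the shipped defaults (the 1000 for depth_limit matches the figure quoted earlier in this thread), so treat them as assumptions and check the limits.conf spec for your version before changing anything. It is not a recommendation of safe values.

```
# limits.conf in a local directory, e.g. $SPLUNK_HOME/etc/system/local/
[rex]
# Assumed default: upper bound on how many times the regex engine
# may call its internal match function for one pattern
match_limit = 100000
# Assumed default: upper bound on the depth of nested backtracking
depth_limit = 1000
```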

 


ITWhisperer
SplunkTrust

If you don't want to use rex, you could use makemv to split the event on the pipe character and then pick out each segment with mvindex:

| makeresults
| eval _raw="2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult {resultCode=0, matchedDN=null, errorMessage=null} ### nEntries=1 ### etime=3 ###"
| makemv delim="|" _raw
| eval field1=mvindex(_raw,0)
| eval field2=mvindex(_raw,1)
| eval field3=mvindex(_raw,2)

I think the issue with your rex is that it contains a few greedy matches, so the engine keeps backtracking and retrying the match, hence the high number of steps.


JosephHobbs
Path Finder

The point here is that it's not my regex.  It was generated by the Splunk UI when I tried to create a field extraction using 'delimiting with pipe'...  The only reason I have that regex in hand is because the error message included it...


ITWhisperer
SplunkTrust

OK, I see. Sometimes, Splunk is too clever for its own good! 😀


JosephHobbs
Path Finder

Yeah.  I feel like it must have something to do with how the UI handles these.  It seems to build a regex for the delimited option, and that regex is a bit overzealous.  Configuring the same extraction as a delimited one on the back end works fine without issues...
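
For anyone landing here later, a back-end delimited extraction along those lines might look roughly like the following. This is a sketch under stated assumptions, not the poster's actual config: the stanza name pipe_delimited_fields, the sourcetype my_sourcetype, and the three field names are all hypothetical placeholders.

```
# transforms.conf
[pipe_delimited_fields]
# Split each event on the pipe character instead of using a regex
DELIMS = "|"
# Names assigned to the segments, in order (hypothetical names)
FIELDS = "log_time", "source_id", "message"

# props.conf
[my_sourcetype]
# Apply the delimiter-based transform at search time
REPORT-pipe_fields = pipe_delimited_fields
```

Since DELIMS splits on the literal character, this route does not involve the generated regex and so should not be subject to the rex limits discussed above.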
