Getting Data In

Index Time Extractions Regex meeting character limit? (props.conf)

oliverja
Path Finder

I have been fighting with a regex in my props.conf (see Regex-working-on-search-but-not-props-transforms), and after a lot of testing I came to the conclusion that my regex is fine, and so is my props.conf. I had originally failed to notice that SOME events were processing fine, and assumed all were bad. So I checked whether there was a cutoff between the good and the bad ones, and found one using len(_raw): my props.conf seems to stop working somewhere after a 4096 character count.

So, I made some test data to pull into Splunk.

Bash:

 

#!/bin/bash
x="a"
count=1
while [ $count -le 5000 ]
do
  echo "$x regex"
  ((count+=1))
  x+="a"
done

 

This just generates a bunch of lines, each one character longer than the last:

 

a regex
aa regex
aaa regex
aaaa regex
aaaaa regex
aaaaaa regex
aaaaaaa regex
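
Before indexing, it can help to confirm how many of the generated lines actually cross the 4096 character mark. A minimal sketch, assuming the output is written to a scratch file (the path below is only a placeholder) and that awk is available:

```shell
#!/bin/bash
# Regenerate the test data into a scratch file (the path is a placeholder).
out="${TMPDIR:-/tmp}/regex_test_data.log"
x="a"
for ((count = 1; count <= 5000; count++)); do
  echo "$x regex"
  x+="a"
done > "$out"

# Count lines at or under 4096 characters, and lines over it.
short=$(awk 'length($0) <= 4096 { n++ } END { print n + 0 }' "$out")
long=$(awk 'length($0) > 4096 { n++ } END { print n + 0 }' "$out")
echo "lines <= 4096 chars: $short"
echo "lines >  4096 chars: $long"
```

Each line is its run of a's plus the six characters of " regex", so the line with 4090 a's is the last one at or under 4096 characters.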

 

Then I pull this into Splunk, one line per event, with a static host field of "x".

Props:

 

[host::x]
SHOULD_LINEMERGE = false
TRANSFORMS-again = regexExtract

 

Transforms:

 

[regexExtract]
REGEX = .*(regex)
DEST_KEY= _raw
FORMAT = $1
WRITE_META = true

 

My search: 

 

 

index="delete" sourcetype="output.log" | eval length=len(_raw) | search length<4100 | table length | sort - length

 

 

I'm assuming my KV limits are OK, based on this btool snippet:

 

 

/opt/splunk/etc/system/default/limits.conf [kv]
/opt/splunk/etc/system/default/limits.conf avg_extractor_time = 500
/opt/splunk/etc/system/default/limits.conf indexed_kv_limit = 200
/opt/splunk/etc/system/default/limits.conf limit = 100
/opt/splunk/etc/system/default/limits.conf max_extractor_time = 1000
/opt/splunk/etc/system/default/limits.conf max_mem_usage_mb = 200
/opt/splunk/etc/system/default/limits.conf maxchars = 10240
/opt/splunk/etc/system/default/limits.conf maxcols = 512

 

 

------------------------------------------

Now that we have all the setup: I search my index and see that every event up through 4097 characters long is processed by the index-time regex.

Beyond that, the search-time regex still works fine, but the index-time one is no longer functional.

[Screenshot: oliverja_0-1652093934827.png]

 

How do I get it to continue processing beyond ~4100 characters?

 

 

 

1 Solution

oliverja
Path Finder

Once I had "4096" as my limit, I was able to change focus a bit and figure out what was happening. So, after way too much time fighting with my logs, I fixed it.

[regexExtract]
REGEX = .*(regex)
# LOOKAHEAD defaults to 4096
LOOKAHEAD = 20000
DEST_KEY= _raw
FORMAT = $1
WRITE_META = true
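
For anyone wondering why the cutoff shows up: LOOKAHEAD limits how many characters into the event the regex searches, so a match that only exists past that point is never found. The sketch below imitates this outside Splunk by truncating a line to a window before matching. It is an illustration under that assumption, not Splunk's actual implementation, and the exact boundary in Splunk may differ by a character or two. The helper names matches_within and make_line are made up for this example.

```shell
#!/bin/bash
# Illustration only: imitate a lookahead window by truncating the line
# before applying the regex. Both helpers are hypothetical.
matches_within() {
  local window=$1 line=$2
  local head=${line:0:window}   # keep only the first <window> characters
  local re='.*(regex)'          # same pattern as in the transform
  [[ $head =~ $re ]]
}

# Build a test line: <n> a's followed by " regex".
make_line() {
  local n=$1 pad
  pad=$(printf '%*s' "$n" '')   # n spaces
  echo "${pad// /a} regex"      # turn the spaces into a's
}

short_line=$(make_line 4000)    # 4006 characters in total
long_line=$(make_line 5000)     # 5006 characters in total

for window in 4096 20000; do
  for name in short_line long_line; do
    line=${!name}
    if matches_within "$window" "$line"; then
      echo "window=$window $name (${#line} chars): match"
    else
      echo "window=$window $name (${#line} chars): no match"
    fi
  done
done
```

With the 4096 window only the shorter line matches; with 20000 both do, which lines up with what I saw after raising LOOKAHEAD.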

 
