Field extraction a problem with large events?

wmosher
Path Finder

Automatic field extraction doesn't always seem to work past a certain number of characters into a (single-line) event. One of our data types always ends in a key=value timing of how long the transaction took to process. It appears that after around 10,000 bytes of data the field does not get extracted automatically. The event is not visibly truncated, and I can see the field we want in _raw. I can even rex it out just fine. Is there a setting somewhere that tells the field extractor how deep into an event to look?

1 Solution

dwaddle
SplunkTrust

Yes, there is a limit for auto-kv extraction. The default is 10,240 chars. To change this, add/edit $SPLUNK_HOME/etc/system/local/limits.conf with this stanza/setting:

[kv]
# truncate _raw to this size and then do auto KV
# 20480, or whatever value you otherwise desire
maxchars = 20480

I'm not sure how far I would be willing to turn this setting up. At some point, it could begin to negatively impact search performance. You may see a small increase in CPU usage during searches as you raise this.

dwaddle
SplunkTrust

Well, I would not do index-time extraction. Perhaps a search-time regex extraction, but it's difficult to say without measurement whether it will be better than auto-kv. Theoretically, increasing this from 10,240 to 102,400 increases the amount of CPU usage by 10x (assuming an O(n) operation). Practically, this may only mean a handful of nanoseconds. One advantage to regex versus auto-kv is that you can limit the regex scope to particular sourcetypes. Raising the kv limit affects processing for every event. Best advice is "measure and compare".
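For reference, a search-time regex extraction scoped to one sourcetype could look roughly like the props.conf sketch below. The sourcetype name (my_transactions) and field name (elapsed) are made up for illustration, since the thread doesn't give the real ones, and the regex assumes the value is numeric and sits at the very end of the event. Adjust all three to match the actual data.

```
# props.conf on the search head, e.g. $SPLUNK_HOME/etc/system/local/props.conf
# or an app's local directory.
# "my_transactions" and "elapsed" are hypothetical names.
[my_transactions]
# Search-time extraction: capture the trailing elapsed=<number> pair.
# The $ anchor assumes the pair is the last thing in the event.
EXTRACT-elapsed = \belapsed=(?<elapsed>\d+)\s*$
```

Because the stanza is keyed on a sourcetype, only events of that sourcetype pay for the regex, unlike raising maxchars under [kv], which applies to every event. Whether it is actually cheaper still needs to be measured.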

wmosher
Path Finder

Thanks dwaddle, this is exactly what I'm looking for.

We have the occasional event above 51,200 characters. The field I am interested in is always the last thing in the event. Since performance could be a concern, would it make more sense to extract that field at index time with a transform, or would that be just as taxing?
