Deployment Architecture

What is the effect of annotate_punct on indexing time?

ddrillic
Ultra Champion

The Architecting Splunk 7.1 Enterprise Deployments class emphasizes that setting ANNOTATE_PUNCT = false in props.conf at the indexer level can significantly improve indexing time.

I wonder why this setting improves indexing time, and in which cases we should keep the punct field.

ddrillic
Ultra Champion

Our sales engineer said:

PUNCT is exactly what it sounds like: an index-time field containing the ordered list of punctuation characters in an event. This is extremely useful for finding “patterns” of events, such as a Windows event where the service name and IP address change but the event structure stays the same.

Splunk sometimes uses it in the background. It is very useful for event types, tagging, etc.
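To make the "same structure, different values" idea concrete, here is a minimal Python sketch that approximates a punctuation pattern. It is not Splunk's implementation: the function name `approximate_punct`, the 30-character cap, the rule that spaces become underscores, and both sample events are assumptions made for illustration only.

```python
import string

# Assumption: an illustrative cap on pattern length, not taken from this thread.
MAX_PUNCT_CHARS = 30


def approximate_punct(event: str, limit: int = MAX_PUNCT_CHARS) -> str:
    """Return a rough punctuation pattern for one event (hypothetical helper).

    Letters and digits are dropped, spaces become underscores, and other
    ASCII punctuation is kept in order, up to `limit` characters.
    """
    pattern = []
    for ch in event:
        if ch == " ":
            pattern.append("_")
        elif ch in string.punctuation:
            pattern.append(ch)
        if len(pattern) >= limit:
            break
    return "".join(pattern)


# Two made-up events: the service name and IP address differ,
# but the structure of the line is the same.
event_a = "2024-01-05 10:15:02 service=sshd src=10.0.0.7 status=ok"
event_b = "2024-02-11 23:59:48 service=nginx src=192.168.1.20 status=fail"

print(approximate_punct(event_a))
print(approximate_punct(event_b))
print(approximate_punct(event_a) == approximate_punct(event_b))
```

Both sample events reduce to the same pattern, which is what makes a field like this handy for grouping events by shape. The sketch also hints at where the cost comes from: every character of every event has to be scanned once at parse time.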

ANNOTATE_PUNCT is the switch that toggles this field. It’s on by default, but if you have:
1. Extremely long events
2. Extremely frequent events
3. Events all of the same PUNCT pattern
4. Events of all different PUNCT patterns

then turning it off will reduce indexer CPU load on the parsing queue in the indexing pipeline.


ddrillic
Ultra Champion

It seems to me that most log files would fall under the third category: events all of the same PUNCT pattern.
