Getting Data In

What is the effect of the LineBreakingProcessor on performance?

Builder

I was wondering what the performance impact is on the indexers when lines are truncated. I have noticed that several events coming in from one of my larger clients have recently been truncated. A few of our Hadoop and Tomcat clients have recently started dumping a lot of data to logs, and unfortunately we haven't had the chance to increase the truncation limits on the sources.

Is there any performance hit every time that truncation occurs?

01-20-2017 20:48:35.429 -0500 WARN  LineBreakingProcessor - Truncating line because limit of 10000 bytes has been exceeded with a line length >= 20423 - data_source="/logs/tomcat12/catalina.out", data_host="cimasked", data_sourcetype="cis:tomcat:catalina"

Splunk Employee

Hi paimonsoror,

The linebreaker processor consumes CPU power, but not as much as the aggregator processor.
You can view Estimated CPU Usage per Splunk Processor in the Monitoring Console:

  1. In Splunk Web, click Settings > Monitoring Console from the menu.
  2. In the Monitoring Console, select Indexing > Performance > Indexing Performance: Instance from the menu.
  3. Scroll down the page to the Estimated CPU Usage per Splunk Processor panel to view the CPU usage by all available processors.

Hope this helps. Thanks!
Hunter

SplunkTrust

The message you are seeing means Splunk has not found the line breaker you've specified even 10000 bytes into the event. There isn't really much of a performance impact if you set TRUNCATE higher in this case. If anything, it may improve performance, because Splunk will no longer be writing so many truncation warnings to disk.

Now, I say this, but let me caution you: if your events aren't supposed to be 20000+ bytes in size, then you don't have an appropriate line breaker to begin with, and that may very well cause performance issues. That scenario is exactly why the default truncate limit is set to 10000 bytes. If you do mess up your line breaker, the limit stops Splunk from loading 10 billion bytes into RAM and running a regex over it to find line breaks.
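To judge whether those 20000+ byte events are expected before touching the limit, it can help to tally the truncation warnings per sourcetype. Below is a minimal Python sketch of that idea. The regex is derived only from the single WARN line quoted in the question, so the message wording in other Splunk versions is an assumption, and `summarize_truncations` is a hypothetical helper name, not part of any Splunk tooling.

```python
import re

# Pattern based on the one WARN line quoted in the question (assumption:
# other Splunk versions may word this message differently).
TRUNCATION_RE = re.compile(
    r'LineBreakingProcessor - Truncating line because limit of (?P<limit>\d+) bytes '
    r'has been exceeded with a line length >= (?P<length>\d+)'
    r'.*?data_source="(?P<source>[^"]*)"'
    r'.*?data_sourcetype="(?P<sourcetype>[^"]*)"'
)


def summarize_truncations(lines):
    """Return {sourcetype: {"count", "limit", "max_length"}} for truncation warnings."""
    summary = {}
    for line in lines:
        match = TRUNCATION_RE.search(line)
        if match is None:
            continue  # not a truncation warning
        entry = summary.setdefault(
            match["sourcetype"], {"count": 0, "limit": 0, "max_length": 0}
        )
        entry["count"] += 1
        entry["limit"] = int(match["limit"])
        # Track the largest reported line length seen for this sourcetype.
        entry["max_length"] = max(entry["max_length"], int(match["length"]))
    return summary


# The warning from the original question, used here as sample input.
sample_lines = [
    '01-20-2017 20:48:35.429 -0500 WARN  LineBreakingProcessor - Truncating line '
    'because limit of 10000 bytes has been exceeded with a line length >= 20423 - '
    'data_source="/logs/tomcat12/catalina.out", data_host="cimasked", '
    'data_sourcetype="cis:tomcat:catalina"',
]

summary = summarize_truncations(sample_lines)
for sourcetype, stats in summary.items():
    print(sourcetype, stats)
```

In practice you would feed it the lines of splunkd.log (normally under $SPLUNK_HOME/var/log/splunk) instead of `sample_lines`. A sourcetype whose reported lengths are far beyond any plausible event size points to a line breaker problem rather than a limit that is too low.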

Builder

Thanks for the response!!

Path Finder

Hey @hunters_splunk,
That was helpful; I just want to know more about this.
Is truncation handled only by the line breaker processor, and not by indexing?
Also, does it affect only CPU and not RAM?
