What is the effect of the LineBreakingProcessor on performance?

paimonsoror
Builder

I was wondering what the performance impact is on the indexers when lines are truncated. I have recently noticed that several events coming in from one of my larger clients have been truncated. A few of our Hadoop and Tomcat clients have recently been dumping a lot of data to logs, and unfortunately we haven't had the chance to increase the truncation limits on the sources.

Is there any performance hit every time that truncation occurs?

01-20-2017 20:48:35.429 -0500 WARN  LineBreakingProcessor - Truncating line because limit of 10000 bytes has been exceeded with a line length >= 20423 - data_source="/logs/tomcat12/catalina.out", data_host="cimasked", data_sourcetype="cis:tomcat:catalina"
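
To gauge how often truncation happens and which sourcetypes are affected, warnings like the one above can be tallied offline. This is only a minimal sketch: it assumes the message format is exactly as quoted in this post, and the helper names (parse_truncation_warning, summarize) are hypothetical, not part of Splunk.

```python
import re
from collections import Counter

# Matches the LineBreakingProcessor warning as quoted in this thread.
# Assumption: other Splunk versions may word the message differently.
TRUNCATION_RE = re.compile(
    r'Truncating line because limit of (?P<limit>\d+) bytes has been exceeded '
    r'with a line length >= (?P<length>\d+) - '
    r'data_source="(?P<source>[^"]*)", '
    r'data_host="(?P<host>[^"]*)", '
    r'data_sourcetype="(?P<sourcetype>[^"]*)"'
)


def parse_truncation_warning(line):
    """Return the fields of one truncation warning, or None if no match."""
    match = TRUNCATION_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["limit"] = int(info["limit"])
    info["length"] = int(info["length"])
    return info


def summarize(lines):
    """Map each sourcetype to (warning count, longest reported line length)."""
    counts = Counter()
    longest = {}
    for line in lines:
        info = parse_truncation_warning(line)
        if info is None:
            continue  # not a truncation warning
        sourcetype = info["sourcetype"]
        counts[sourcetype] += 1
        longest[sourcetype] = max(longest.get(sourcetype, 0), info["length"])
    return {st: (counts[st], longest[st]) for st in counts}
```

Feeding it the lines of splunkd.log would show, per sourcetype, how many warnings were logged and the largest line length reported, which gives a rough idea of how high a truncation limit would need to be.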

hunters_splunk
Splunk Employee

Hi paimonsoror,

The linebreaker processor consumes CPU power, but not as much as the aggregator processor.
You can view Estimated CPU Usage per Splunk Processor in the Monitoring Console:

  1. In Splunk Web, click Settings > Monitoring Console from the menu.
  2. In the Monitoring Console, select Indexing > Performance > Indexing Performance: Instance from the menu.
  3. Scroll down the page to the Estimated CPU Usage per Splunk Processor panel to view the CPU usage by all available processors.

Hope this helps. Thanks!
Hunter

jkat54
SplunkTrust

The message you are seeing means the line breaker you've specified was not found within the first 10000 bytes of the event. Setting TRUNCATE higher in this case doesn't have much of an impact. If anything, it may improve performance, because Splunk won't be writing so many truncation warnings to disk.

I say this with a caution, though: if your events aren't supposed to be 20000+ bytes in size, then you don't have an appropriate line breaker to begin with, and that may very well cause performance issues. That scenario is exactly why the default truncate is set to 10000 bytes: if you do mess up your line breaker, it stops Splunk from loading 10 billion bytes into RAM and running a regex over it to find line breaks.
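
For reference, here is a rough sketch of what raising the limit could look like in props.conf for the sourcetype from the warning. The value 25000 is only an assumption chosen to sit above the 20423-byte line reported in the question, and the LINE_BREAKER shown is simply Splunk's default, included to mark where a corrected pattern would go.

```ini
# props.conf, on the instance that parses this data
[cis:tomcat:catalina]
# Assumed value: pick a limit above your longest legitimate event.
TRUNCATE = 25000
# Splunk's default line breaker. Replace it only if events are being
# split in the wrong places.
LINE_BREAKER = ([\r\n]+)
```

If the long lines turn out to be several events run together, fixing the line breaker is the better first step, and the default TRUNCATE can stay in place as the safety net described above.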

paimonsoror
Builder

Thanks for the response!!

sarvesh_11
Communicator

Hey @hunters_splunk,
that was helpful; I'd just like to know more about this.
Is truncation handled only by the line breaker processor, and not by indexing?
Also, does it affect only CPU and not RAM?
