Why are larger events truncated (10000 bytes)?

jayannah
Builder

Hi,

The data flow is UFs-->HWF-->INDEXERs.

Some of the event lines are 100K to 300K bytes long.
By default, Splunk truncates events at 10,000 bytes.
Following the props.conf documentation, I put the configuration below on the indexers:

props.conf
[my-source-type]
TRUNCATE=500000

I restarted the indexers, but the events are still being truncated at ~10000 characters.

Do I need to put these properties on the HWF as well?

I have not set TRUNCATE=0 because, per the documentation, very long lines are often a sign of garbage data. Instead I set it to 500000 after discussing it with the developers.

props.conf...
TRUNCATE = <non-negative integer>
* Change the default maximum line length (in bytes).
* Although this is in bytes, line length is rounded down when this would otherwise land mid-character for multi-byte characters.
* Set to 0 if you never want truncation (very long lines are, however, often a sign of garbage data).
* Defaults to 10000 bytes.
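
To make the "rounded down" wording in the spec excerpt concrete, here is a minimal Python sketch of a byte limit that never splits a multi-byte character. It is only an illustration of the described behavior, not Splunk's implementation; the function name `truncate_bytes` and the UTF-8 default are assumptions.

```python
def truncate_bytes(line: str, limit: int, encoding: str = "utf-8") -> str:
    """Cut `line` to at most `limit` encoded bytes without splitting a character.

    A limit of 0 means "never truncate", mirroring the TRUNCATE = 0 convention.
    (Hypothetical helper for illustration only.)
    """
    raw = line.encode(encoding)
    if limit == 0 or len(raw) <= limit:
        return line
    # errors="ignore" drops an incomplete trailing byte sequence, which rounds
    # the cut down to the previous whole character.
    return raw[:limit].decode(encoding, errors="ignore")


if __name__ == "__main__":
    # "é" takes two bytes in UTF-8, so a 2-byte limit would land mid-character.
    print(truncate_bytes("héllo", 2))
    print(len(truncate_bytes("a" * 20000, 10000)))
```

With a limit of 2 bytes, the cut falls inside "é", so only "h" survives; with a limit of 3 bytes the result is "hé".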

yannK
Splunk Employee

You have a heavy forwarder in the picture:
UFs-->HWF-->INDEXERs

The events are therefore parsed on the heavy forwarder before they ever reach the indexers. Please put a copy of the props.conf on the HWF and restart it to apply the change.

bandit
Motivator

Here is a Splunk query that finds truncation issues and recommends a TRUNCATE value for props.conf:

index="_internal" sourcetype=splunkd source="*splunkd.log" log_level="WARN" "Truncating" 
| rex "line length >= (?<line_length>\d+)" 
| stats values(host) as host values(data_host) as data_host count last(_raw) as common_events last(_time) as _time max(line_length) as max_line_length by data_sourcetype log_level 
| table _time host data_host data_sourcetype log_level max_line_length count common_events 
| rename data_sourcetype as sourcetype 
| eval number=max_line_length 
| eval recommended_truncate=max_line_length+100000 
| eval recommended_truncate=recommended_truncate-(recommended_truncate%100000) 
| eval recommended_config="# props.conf
 ["+sourcetype+"]
 TRUNCATE = "+recommended_truncate 
| table _time host data_host sourcetype log_level max_line_length recommended_truncate recommended_config count common_events 
| sort -count
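
The two eval steps in the query that compute the recommended value are plain integer arithmetic: add 100000 to the longest line seen, then drop the remainder modulo 100000. A small Python sketch of the same calculation follows; the function name and the `step` parameter are made up for illustration.

```python
def recommended_truncate(max_line_length: int, step: int = 100_000) -> int:
    """Mirror the query's arithmetic: pad by one step, then round down to a step multiple."""
    padded = max_line_length + step
    return padded - (padded % step)


if __name__ == "__main__":
    for length in (10_000, 237_512, 300_000):
        print(length, "->", recommended_truncate(length))
```

For example, a longest line of 237512 bytes gives 337512 - 37512 = 300000. The result is always the next multiple of 100000 strictly above the observed length, so a line of exactly 300000 bytes yields 400000.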

davedoucette
Loves-to-Learn

I have the same problem. Where do I find the config files to make the suggested changes on a Windows machine?

erez10121012
Path Finder

works for me 🙂

jayannah
Builder

Even with both of the props.conf settings mentioned above, the events are not breaking correctly. I thought event breaking and truncation were unrelated. The events are breaking (incorrectly) at:
<- Date: Fri, 12 Sep 2014 19:08:42 GMT
<- Access-Control-Allow-Origin: *
<- Content-Length: 295
<- Echo: bf4-bdd15
<- Access-Control-Max-Age: 3600

and also at
-> Signature: nonce="VViIjHdshDRRZake1qrL57vWMC7ynq", timestamp="1410548920", method="HMAC-SHA256", signature="FFB91*******58C7BDE7"
-> Session-Id: e97
***08cd6f5

jayannah
Builder

This is the props.conf:
[my_test_app]
BREAK_ONLY_BEFORE=\d+/\d+/\d+\s+\d+:\d+:\d+\s+\w+\s+\[
MAX_TIMESTAMP_LOOKAHEAD=150
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
TRUNCATE = 500000

The actual props.conf from the Splunk Web data preview:
BREAK_ONLY_BEFORE=^\d+/\d+/\d+\s+\d+:\d+:\d+
MAX_TIMESTAMP_LOOKAHEAD=150
NO_BINARY_CHECK=1
SHOULD_LINEMERGE=true
TRUNCATE = 500000

Neither is working.

I also tried TRUNCATE = 0, and that does not work either. The events are still truncated.
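
One way to sanity-check the two BREAK_ONLY_BEFORE patterns outside Splunk is to run them against a few lines of the kind posted in this thread. The sketch below uses Python's `re` module as a stand-in for Splunk's PCRE engine, and it assumes the trailing `[` in the first pattern is meant as a literal bracket (`\[`). The dictionary keys and the helper name are made up for illustration.

```python
import re

# Patterns from the two props.conf variants in this thread. The escaped "\["
# at the end of the first one is an assumption about the poster's intent.
PATTERNS = {
    "posted": r"\d+/\d+/\d+\s+\d+:\d+:\d+\s+\w+\s+\[",
    "data_preview": r"^\d+/\d+/\d+\s+\d+:\d+:\d+",
}

# Lines of the kind posted in this thread (values edited by the poster).
SAMPLE_LINES = [
    "2014/09/12 14:50:14 INFO [Orol-672] [c.gFilter] [de7e9d7dc] [811bc250] "
    "[6118f] [k2b.one.com, t_pas_12] REST client request 1724 entity:",
    '{"sId":"4f393f7a57cbf9b6","authenticationTypeCode":"ELI"}',
    "<- Date: Fri, 12 Sep 2014 19:08:42 GMT",
]


def starts_new_event(pattern: str, line: str) -> bool:
    """Return True if a line matches the pattern, i.e. would begin a new event."""
    return re.search(pattern, line) is not None


if __name__ == "__main__":
    for name, pattern in PATTERNS.items():
        for line in SAMPLE_LINES:
            print(name, starts_new_event(pattern, line), line[:40])
```

Under these assumptions only the timestamped line matches either pattern, so the JSON body and the "<- Date:" header line would not be expected to start new events wherever these rules are actually applied.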

jayannah
Builder

Yes, I set the LINE_BREAKER using preview mode only. Here are some sample log lines (I edited just the values):

2014/09/12 14:50:14 INFO [Orol-672] [c.gFilter] [de7e9d7dc] [811bc250] [6118f] [k2b.one.com, t_pas_12] REST client request 1724 entity:
{"sId":"4f393f7a57cbf9b6","authenticationTypeCode":"ELI","deviceFingerprintXml":"iPhone87-AC-A5BF"}

i2sheri
Communicator

I have the same problem, but I have the props.conf settings only on the heavy forwarders and search heads. Do I need these settings on the indexers too?

[xml]
KV_MODE = xml
DATETIME_CONFIG = NONE
BREAK_ONLY_BEFORE = ^\<?xml
MAX_EVENTS = 500
TRUNCATE = 25000

mufthmu
Path Finder

Hi @yannK,
I already updated the props.conf on my indexer and forwarder, but my data still gets truncated to 100 KB.
Do you know how to find out whether my data flows through the HWF before reaching the indexer?

bandit
Motivator

Hi, @mufthmu, you can look at outputs.conf on each instance to see where it's routing to. Typically, you'll need to have these line breaking rules configured on the first touch point of a full Splunk instance, whether that's a heavy forwarder or indexer.

e.g.
Universal Forwarder --> Indexers (props.conf here)
OR
Universal Forwarder --> Heavy Forwarder (props.conf here) --> Indexers
OR
Heavy Forwarder (props.conf here) --> Indexers

I suppose you could also install in both locations (Heavy Forwarder and Indexer) if that's simpler for you.
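
The "first touch point of a full Splunk instance" rule can be written down as a tiny lookup. This is just a sketch of the rule as described in this post; the tier names and the function are made up for illustration.

```python
from typing import Optional, Sequence

# Tiers that are full Splunk instances and therefore parse events.
# A universal forwarder is deliberately absent from this set.
FULL_INSTANCES = {"heavy_forwarder", "indexer"}


def parsing_tier(path: Sequence[str]) -> Optional[str]:
    """Return the first full Splunk instance on the data path, or None."""
    for tier in path:
        if tier in FULL_INSTANCES:
            return tier
    return None


if __name__ == "__main__":
    print(parsing_tier(["universal_forwarder", "indexer"]))
    print(parsing_tier(["universal_forwarder", "heavy_forwarder", "indexer"]))
    print(parsing_tier(["heavy_forwarder", "indexer"]))
```

For the original poster's flow (UFs-->HWF-->INDEXERs) this returns the heavy forwarder, which is where the line-breaking and TRUNCATE settings need to live.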

In the outputs.conf for your Splunk instances you'll see something like the following (the port is often 9997):

server=<receiving_server1>, <receiving_server2>
or a stanza such as [tcpout-server://<ipaddress_or_hostname>:<port>]

If you have command-line access on a Linux server, you can run btool with --debug (your Splunk path may vary) to list the merged configuration Splunk is using for outputs.conf.

example:

  /opt/splunk/bin/splunk btool --debug outputs list |egrep "server|tcpout-server"
  /opt/splunkforwarder/bin/splunk btool --debug outputs list |egrep "server|tcpout-server"
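
If you only have a copy of an outputs.conf file rather than shell access, a rough Python sketch like the one below can pull out the same receiver entries. The sample file contents, host names, and the `receivers` helper are all hypothetical, and Python's `configparser` is only an approximation of how Splunk reads .conf files (it does not merge layered configs the way btool does).

```python
import configparser
from typing import List

# Hypothetical outputs.conf contents, for illustration only.
SAMPLE_OUTPUTS_CONF = """\
[tcpout]
defaultGroup = my_indexers

[tcpout:my_indexers]
server = idx1.example.com:9997, idx2.example.com:9997

[tcpout-server://10.1.1.1:9997]
"""


def receivers(conf_text: str) -> List[str]:
    """Collect host:port targets from server= settings and tcpout-server stanzas."""
    parser = configparser.ConfigParser(
        delimiters=("=",), interpolation=None, strict=False
    )
    parser.read_string(conf_text)
    prefix = "tcpout-server://"
    found: List[str] = []
    for section in parser.sections():
        if section.startswith(prefix):
            found.append(section[len(prefix):])
        if parser.has_option(section, "server"):
            value = parser.get(section, "server")
            found.extend(s.strip() for s in value.split(",") if s.strip())
    return found


if __name__ == "__main__":
    print(receivers(SAMPLE_OUTPUTS_CONF))
```

Whatever hosts show up are the next hop for that instance. If the next hop is itself a heavy forwarder, that is the tier doing the parsing.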

jayannah
Builder

Thanks yannK and sowings. It worked after placing the props.conf file on both the indexers and the HWFs.

srramu46
Engager

jayannah, can you please send me the steps for adding props.conf to the indexers and the HWF?

sowings
Splunk Employee

A heavy forwarder is an indexer with an outputs.conf. It is parsing events--it needs the LINE_BREAKER and TRUNCATE settings.

jayannah
Builder

Thanks for the response. But the HWF just blindly streams out the incoming data, right? It shouldn't truncate the event, since it doesn't store anything. I think neither LINE_BREAKER nor TRUNCATE should be required on the HWF. Please confirm.

theouhuios
Motivator

You should set your LINE_BREAKER correctly. That should be the first thing to check. Please post some lines showing how the event starts and how it ends. Try out the Preview mode in Data inputs, check the LINE_BREAKER there, and see if that solves it.
