<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: truncation issue: identifying where it's happening. in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43235#M8059</link>
    <description>&lt;P&gt;Can you try logging to a TCP input?  I imagine you are hitting the MTU for those UDP log packets.  In our environment, we log to syslog-ng and then index those logs, taking the host off the segment on the path.  The results are more predictable than using a Splunk input.  Hope this helps.&lt;/P&gt;</description>
    <pubDate>Fri, 23 Aug 2013 03:28:43 GMT</pubDate>
    <dc:creator>fervin</dc:creator>
    <dc:date>2013-08-23T03:28:43Z</dc:date>
    <item>
      <title>truncation issue: identifying where it's happening.</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43232#M8056</link>
      <description>&lt;P&gt;We are using Splunk to collect logs from a Java-based application. Our logging configuration is as follows:&lt;/P&gt;

&lt;P&gt;The Java app uses a syslog appender configured to log to udp:localhost:514. A Splunk forwarder is installed on the app server with a udp:514 listener (overridden to a log4j sourcetype). The forwarder then sends to our Splunk indexer using standard forwarding (tcp:9997) across the network. The system has worked well for us until now: unreliable UDP is only used over a local interface, so the chance of losing packets is minimal. However, we have added some additional logging to our application and found that events logged from it are getting truncated. The truncation appears to happen at the 64k mark. Here are the relevant Splunk configs:&lt;/P&gt;

&lt;P&gt;inputs.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[udp://514]
connection_host = none
sourcetype = log4j
_rcvbuf = 3145728
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;props.conf&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[default]
CHARSET = UTF-8
LINE_BREAKER_LOOKBEHIND = 1000
TRUNCATE = 100000
DATETIME_CONFIG = /etc/datetime.xml
ANNOTATE_PUNCT = True
HEADER_MODE =
MAX_DAYS_HENCE=2
MAX_DAYS_AGO=2000
MAX_DIFF_SECS_AGO=3600
MAX_DIFF_SECS_HENCE=604800
MAX_TIMESTAMP_LOOKAHEAD = 128
SHOULD_LINEMERGE = true
BREAK_ONLY_BEFORE =
BREAK_ONLY_BEFORE_DATE = true
MAX_EVENTS = 7000
MUST_BREAK_AFTER =
MUST_NOT_BREAK_AFTER =
MUST_NOT_BREAK_BEFORE =
TRANSFORMS =
SEGMENTATION          = indexing
SEGMENTATION-all      = full
SEGMENTATION-inner    = inner
SEGMENTATION-outer    = outer
SEGMENTATION-raw      = none
SEGMENTATION-standard = standard
LEARN_SOURCETYPE      = false
maxDist = 100

[log4j]
MAX_EVENTS = 7000
SHOULD_LINEMERGE = true
TRUNCATE = 100000
&lt;/CODE&gt;&lt;/PRE&gt;
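
&lt;P&gt;One thing that may be relevant to the 64k mark, independent of the Splunk settings above: a single UDP datagram over IPv4 can carry at most 65,507 bytes of payload (65,535 minus a 20-byte IP header and an 8-byte UDP header). A minimal Python sketch of that arithmetic follows, assuming the appender sends each event as one datagram; the helper names are hypothetical and not part of Splunk or log4j.&lt;/P&gt;

```python
# Assumption: each log event is sent as exactly one UDP datagram over IPv4.
IPV4_MAX_PACKET = 65535  # largest value of the 16-bit IPv4 total-length field
IPV4_HEADER = 20         # minimum IPv4 header size, no options
UDP_HEADER = 8           # fixed UDP header size
MAX_UDP_PAYLOAD = IPV4_MAX_PACKET - IPV4_HEADER - UDP_HEADER  # 65507


def fits_in_one_datagram(event: str, encoding: str = "utf-8") -> bool:
    """Return True if the encoded event fits in a single UDP datagram."""
    return MAX_UDP_PAYLOAD >= len(event.encode(encoding))


def bytes_over_limit(event: str, encoding: str = "utf-8") -> int:
    """Return how many encoded bytes would not fit in one datagram."""
    return max(0, len(event.encode(encoding)) - MAX_UDP_PAYLOAD)
```

&lt;P&gt;If the larger events exceed this limit, the cut-off would happen before the data ever reaches the props.conf TRUNCATE setting, which would be consistent with raising TRUNCATE having no effect.&lt;/P&gt;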

&lt;P&gt;Where should we be looking for the truncation? &lt;/P&gt;</description>
      <pubDate>Thu, 22 Aug 2013 12:06:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43232#M8056</guid>
      <dc:creator>brettcave</dc:creator>
      <dc:date>2013-08-22T12:06:43Z</dc:date>
    </item>
    <item>
      <title>Re: truncation issue: identifying where it's happening.</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43233#M8057</link>
      <description>&lt;P&gt;A note on the above config: I have only just added the _rcvbuf parameter to rule out a buffering issue, and the truncation still occurs, so buffering is not the cause.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Aug 2013 12:14:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43233#M8057</guid>
      <dc:creator>brettcave</dc:creator>
      <dc:date>2013-08-22T12:14:05Z</dc:date>
    </item>
    <item>
      <title>Re: truncation issue: identifying where it's happening.</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43234#M8058</link>
      <description>&lt;P&gt;Some other methods we have used to try to debug this: turning off the Splunk forwarder and using &lt;CODE&gt;nc -l -u localhost 514 | tee manual_log.log&lt;/CODE&gt; to create a UDP listener that logs to a file, but netcat seems to hang when we get to the big logs. We have also tried saving an example of a truncated log to a file and piping it into the forwarder with &lt;CODE&gt;nc -u localhost 514 &amp;lt; biglogentry&lt;/CODE&gt;, but that doesn't show up in the indexer.&lt;/P&gt;</description>
      <pubDate>Thu, 22 Aug 2013 12:17:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43234#M8058</guid>
      <dc:creator>brettcave</dc:creator>
      <dc:date>2013-08-22T12:17:22Z</dc:date>
    </item>
    <item>
      <title>Re: truncation issue: identifying where it's happening.</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43235#M8059</link>
      <description>&lt;P&gt;Can you try logging to a TCP input?  I imagine you are hitting the MTU for those UDP log packets.  In our environment, we log to syslog-ng and then index those logs, taking the host off the segment on the path.  The results are more predictable than using a Splunk input.  Hope this helps.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Aug 2013 03:28:43 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43235#M8059</guid>
      <dc:creator>fervin</dc:creator>
      <dc:date>2013-08-23T03:28:43Z</dc:date>
    </item>
    <item>
      <title>Re: truncation issue: identifying where it's happening.</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43236#M8060</link>
      <description>&lt;P&gt;I tried with netcat and found that we were only getting 1024 bytes, so it definitely seems to be related to network buffers / units. &lt;CODE&gt;ifconfig lo&lt;/CODE&gt; gives &lt;CODE&gt;lo: flags=73&amp;lt;UP,LOOPBACK,RUNNING&amp;gt;  mtu 65536&lt;/CODE&gt;, which exactly matches the size of the truncated logs, so I guess there are two approaches: (1) override the localhost MTU, or (2) configure the application to log these events to a file and add a file monitor to Splunk. The logging framework doesn't offer a TCP syslog appender, and the TCP logging option ("socket appender") uses escape sequences which come through to Splunk as literal characters.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Aug 2013 10:09:48 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/truncation-issue-identifying-where-it-s-happening/m-p/43236#M8060</guid>
      <dc:creator>brettcave</dc:creator>
      <dc:date>2013-08-23T10:09:48Z</dc:date>
    </item>
  </channel>
</rss>

