<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: json gets truncated in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470190#M80859</link>
    <description>&lt;P&gt;&lt;A href="https://answers.splunk.com/answers/577522/json-file-getting-truncated.html"&gt;json-file-getting-truncated&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;how about this?&lt;/P&gt;</description>
    <pubDate>Sat, 28 Dec 2019 00:54:05 GMT</pubDate>
    <dc:creator>to4kawa</dc:creator>
    <dc:date>2019-12-28T00:54:05Z</dc:date>
    <item>
      <title>json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470189#M80858</link>
      <description>&lt;P&gt;Valid JSON gets truncated for some reason. Below is my props.conf:&lt;/P&gt;

&lt;P&gt;TRUNCATE = 0&lt;BR /&gt;
KV_MODE = json&lt;BR /&gt;
NO_BINARY_CHECK = true&lt;BR /&gt;
BREAK_ONLY_BEFORE = ^\x7B&lt;BR /&gt;
LINE_BREAKER = ([\r\n]+)(\x7B)&lt;BR /&gt;
SHOULD_LINEMERGE = false&lt;BR /&gt;
DATETIME_CONFIG = CURRENT&lt;/P&gt;

&lt;P&gt;Any suggestions?&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 03:26:38 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470189#M80858</guid>
      <dc:creator>gkapitany</dc:creator>
      <dc:date>2020-09-30T03:26:38Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470190#M80859</link>
      <description>&lt;P&gt;&lt;A href="https://answers.splunk.com/answers/577522/json-file-getting-truncated.html"&gt;json-file-getting-truncated&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;how about this?&lt;/P&gt;</description>
      <pubDate>Sat, 28 Dec 2019 00:54:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470190#M80859</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2019-12-28T00:54:05Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470191#M80860</link>
      <description>&lt;P&gt;I've also tried a few alternative line breakers, with no success:&lt;/P&gt;

&lt;P&gt;LINE_BREAKER = ([\r\n]+)(\x7B(\x22))audit&lt;BR /&gt;
LINE_BREAKER = ([\r\n]+)(\x7B)audit&lt;BR /&gt;
LINE_BREAKER = ([\r\n]*)(?={)&lt;BR /&gt;
no line breaker and INDEXED_EXTRACTIONS = json&lt;/P&gt;

&lt;P&gt;Below is the beginning of the json (truncated here to keep the post clean)&lt;BR /&gt;
{"audit":"16463","hostScore":"0","name":"to8pt.sample.com","macAddress":"","os":"OS Undetermined",&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 03:29:44 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470191#M80860</guid>
      <dc:creator>gkapitany</dc:creator>
      <dc:date>2020-09-30T03:29:44Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470192#M80861</link>
      <description>&lt;P&gt;&lt;CODE&gt;LINE_BREAKER = ([\r\n]*)(?=\{)&lt;/CODE&gt;&lt;BR /&gt;
How about this?&lt;BR /&gt;
You need to escape the "{" character.&lt;/P&gt;</description>
      <pubDate>Mon, 30 Dec 2019 21:32:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470192#M80861</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2019-12-30T21:32:57Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470193#M80862</link>
      <description>&lt;P&gt;It didn't work either.&lt;/P&gt;</description>
      <pubDate>Tue, 31 Dec 2019 18:17:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470193#M80862</guid>
      <dc:creator>gkapitany</dc:creator>
      <dc:date>2019-12-31T18:17:10Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470194#M80863</link>
      <description>&lt;PRE&gt;&lt;CODE&gt;TRUNCATE = 0
KV_MODE = json
NO_BINARY_CHECK = true
LINE_BREAKER = ([\r\n]+)(?=\{)
SHOULD_LINEMERGE = false
DATETIME_CONFIG = CURRENT
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;How many events are actually there, and how many of them are truncated?&lt;BR /&gt;
Also, is it really LINE_BREAKER that isn't working?&lt;BR /&gt;
I think that is a different issue from the one in your question.&lt;/P&gt;

&lt;P&gt;&lt;A href="https://answers.splunk.com/answers/511589/how-to-configure-line-breaking-for-my-sample-json.html"&gt;JSON props.conf&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 31 Dec 2019 22:01:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470194#M80863</guid>
      <dc:creator>to4kawa</dc:creator>
      <dc:date>2019-12-31T22:01:25Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470195#M80864</link>
      <description>&lt;P&gt;Could you run &lt;CODE&gt;| eval eventlength = len(_raw)&lt;/CODE&gt; to see whether Splunk truncates at the same position every time?&lt;/P&gt;</description>
      <pubDate>Wed, 01 Jan 2020 13:52:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470195#M80864</guid>
      <dc:creator>rvaglid</dc:creator>
      <dc:date>2020-01-01T13:52:45Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470196#M80865</link>
      <description>&lt;P&gt;Everything gets truncated eventually, unless you use the (somewhat dangerous) &lt;CODE&gt;TRUNCATE = 0&lt;/CODE&gt; setting.  Up your value for &lt;CODE&gt;TRUNCATE&lt;/CODE&gt;:&lt;BR /&gt;
&lt;A href="https://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf"&gt;https://docs.splunk.com/Documentation/Splunk/latest/Admin/Propsconf&lt;/A&gt;&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;#******************************************************************************
# Line breaking
#******************************************************************************

# Use the following attributes to define the length of a line.

TRUNCATE = &amp;lt;non-negative integer&amp;gt;
 * Change the default maximum line length (in bytes).
 * Although this is in bytes, line length is rounded down when this would
  otherwise land mid-character for multi-byte characters.
 * Set to 0 if you never want truncation (very long lines are, however, often a sign of
  garbage data).
 * Defaults to 10000 bytes.
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;You should be getting logs like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;01-01-2020 18:40:37.625 +0000 WARN LineBreakingProcessor - Truncating line because limit of 10000 has been exceeded
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 01 Jan 2020 16:08:24 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470196#M80865</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2020-01-01T16:08:24Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470197#M80866</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;No, there isn't any log record about truncation due to length. I set TRUNCATE = 0 to rule out any potential issue due to length. The intent is to set it to 30000 once I figure out why events get truncated.&lt;/P&gt;

&lt;P&gt;All error messages are like the one below, but with different values:&lt;BR /&gt;
ERROR JsonLineBreaker - JSON StreamId:18294845293918380307 had parsing error:Unexpected character while looking for value: 'r' - data_source="/opt/splunk/vne2splunk/log.json", data_host="splmx1.sample.com", data_sourcetype="_json"&lt;/P&gt;
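
&lt;P&gt;One way to check whether the file itself contains a malformed record at that point is to run it through a strict JSON parser outside Splunk. The following is only a minimal sketch in Python, not Splunk's own JsonLineBreaker logic: the function name find_bad_records is hypothetical, and it assumes the file is a series of JSON objects written back to back, separated by whitespace or newlines.&lt;/P&gt;

```python
import json


def find_bad_records(text):
    """Decode back-to-back JSON values.

    Returns (good_count, error), where error is None if every record
    parsed, or a dict describing the first failure.
    """
    decoder = json.JSONDecoder()
    pos = 0
    good = 0
    while True:
        # Skip whitespace and newlines between records.
        while pos != len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            return good, None
        try:
            # raw_decode returns the parsed value and the end offset.
            _, pos = decoder.raw_decode(text, pos)
        except json.JSONDecodeError as err:
            # err.pos is the character offset where parsing failed.
            context = text[max(0, err.pos - 40):err.pos + 40]
            return good, {
                "offset": err.pos,
                "message": err.msg,
                "context": context,
            }
        good += 1
```

&lt;P&gt;Calling find_bad_records on the contents of /opt/splunk/vne2splunk/log.json and printing the returned context would show whether an unquoted or otherwise invalid value sits at the failing offset, which may be what the "Unexpected character while looking for value" message is pointing at.&lt;/P&gt;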

&lt;P&gt;Some logs are parsed correctly like the one below:&lt;BR /&gt;
{&lt;BR /&gt;
    "audit": "16489",&lt;BR /&gt;
    "hostScore": "0",&lt;BR /&gt;
    "name": "to8pt.sample.com",&lt;BR /&gt;
    "macAddress": "",&lt;BR /&gt;
    "os": "OS Undetermined",&lt;BR /&gt;
    "vulnerabilities": "1",&lt;BR /&gt;
    "netbiosName": "",&lt;BR /&gt;
    "application": {&lt;BR /&gt;
        "": "port - 5040",&lt;BR /&gt;
        "id: 6119 Application: DCE/MS RPC Endpoint Mapper Interface (TCP) description: DCE/MS RPC Endpoint Mapper Interface. parent: 165": "port - 135",&lt;BR /&gt;
        "id: 165 Service: DCE/MS RPC over TCP description: Microsoft RPC (Remote Procedure Call) over TCP is used by many services, including: DHCP Manager, DNS Administration, WINS Manager, Exchange Client/Server, Exchange Administrator and RPC. Third party applications, such as Symantec/Veritas BackupExec, may also make use of it. protocol: tcp transport: n/a parentid: n/a": "port - 135",&lt;BR /&gt;
        "id: 8037 Service: IPv4 Layer 4 description: Generic Layer 3 / Layer 4 RAW socket access. protocol: ip transport: n/a parentid: n/a": "port - 0"&lt;BR /&gt;
    },&lt;BR /&gt;
    "timeStamp": "2020-01-02 00:03:56",&lt;BR /&gt;
    "ipAddress": "172.16.25.32",&lt;BR /&gt;
    "id": "4128157",&lt;BR /&gt;
    "network": "INT - Transports"&lt;BR /&gt;
}&lt;/P&gt;

&lt;P&gt;The only difference is that the "application" object varies in length. In one example, the event in Splunk is truncated at character 14,532, while the original JSON has 15,071 characters.&lt;/P&gt;
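
&lt;P&gt;To see exactly which characters sit at the cut, the original record can be compared with the _raw text of the truncated event. This is a rough sketch under the assumption that both are available as plain strings (for example, copied from the source file and from the search results); the helper names first_divergence and context_at are hypothetical.&lt;/P&gt;

```python
def first_divergence(original, indexed):
    """Return the offset where `indexed` stops matching `original`.

    If `indexed` is a clean prefix of `original`, this is len(indexed).
    """
    limit = min(len(original), len(indexed))
    for i in range(limit):
        if original[i] != indexed[i]:
            return i
    return limit


def context_at(original, offset, width=30):
    """Return repr() of the text just before and just after `offset`.

    repr() makes control characters such as \r and \n visible.
    """
    before = original[max(0, offset - width):offset]
    after = original[offset:offset + width]
    return repr(before), repr(after)
```

&lt;P&gt;Passing the offset from first_divergence to context_at shows the text on both sides of the truncation point, so a stray line break or other unusual sequence inside a long "application" key becomes visible.&lt;/P&gt;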

&lt;P&gt;This leads me to believe that the issue is related to some character sequence, but I'm not sure which one.&lt;/P&gt;</description>
      <pubDate>Wed, 30 Sep 2020 03:34:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470197#M80866</guid>
      <dc:creator>gkapitany</dc:creator>
      <dc:date>2020-09-30T03:34:56Z</dc:date>
    </item>
    <item>
      <title>Re: json gets truncated</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470198#M80867</link>
      <description>&lt;P&gt;Hi rvaglid,&lt;/P&gt;

&lt;P&gt;I ran the suggested eval on a few entries and the truncation position is not consistent:&lt;BR /&gt;
11170&lt;BR /&gt;
12231&lt;BR /&gt;
13721&lt;BR /&gt;
11331&lt;/P&gt;

&lt;P&gt;As I mentioned above, the truncation doesn't appear to be caused by length but rather by some character sequence.&lt;/P&gt;</description>
      <pubDate>Thu, 02 Jan 2020 13:24:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/json-gets-truncated/m-p/470198#M80867</guid>
      <dc:creator>gkapitany</dc:creator>
      <dc:date>2020-01-02T13:24:10Z</dc:date>
    </item>
  </channel>
</rss>

