<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Need help with linebreaker for array of json objects in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211185#M61743</link>
    <description>&lt;P&gt;Are your events breaking correctly? If you have set LINE_BREAKER, then SHOULD_LINEMERGE should be set to false, not true. For some reason, setting this through the UI does not work: Splunk reverts it to true and adds a BREAK_ONLY_BEFORE setting alongside the line breaker. This could be causing part of the problem that you are seeing ...&lt;/P&gt;</description>
    <pubDate>Tue, 29 Sep 2020 11:04:54 GMT</pubDate>
    <dc:creator>lquinn</dc:creator>
    <dc:date>2020-09-29T11:04:54Z</dc:date>
    <item>
      <title>Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211184#M61742</link>
      <description>&lt;P&gt;I am indexing json files.  Each file contains an array of  around 1,000 json objects (with nested arrays/objects).  I need to extract each object as a single event.  (See sample json source and props.conf below).  &lt;/P&gt;

&lt;P&gt;I use the "Add Data" button in the UI to index the file, and it looks like it gets all the events.  If I search for all the events, the first JSON object does show up.  However, it looks like KV_MODE=json stumbles on the initial [ and is unable to extract the fields: if I search for one of the fields in the data &lt;EM&gt;(index=foo coach="matt")&lt;/EM&gt;, the event is not returned.  If I search for just the value of the field &lt;EM&gt;(index=foo matt)&lt;/EM&gt;, the event is returned.&lt;/P&gt;

&lt;P&gt;How do I modify my props.conf to correctly handle the first object in the array?&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[
    {    
        "team" : "spirit",        
        "coach": "matt",
        "regDate": "2016-07-31T12:23:34Z",
        "players": [
          {
            "name":"Marissa",
            "positions": ["2B", "P", "C", "RF"]
          },
          {
            "name":"Sierra",
            "positions": ["SS","LF"]
          }
        ]
    },
    {    
        "team" : "chill",        
        "coach": "bob",
        "regDate": "2016-08-01T12:15:19Z",
        "players": [
          {
            "name":"Rhi",
            "positions": ["3B", "CF","1B"]
          }
        ]
    }
]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;This is my props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt; [json_linebreaker]
 JSON_TRIM_BRACES_IN_ARRAY_NAMES=true
 KV_MODE=json
 LINE_BREAKER=\s{4}\},(,[\n\r])\s{4}\{(.*)
 MAX_TIMESTAMP_LOOKAHEAD=30
 NO_BINARY_CHECK=true
 SHOULD_LINEMERGE=true
 TIME_FORMAT=%Y-%m-%dT%H:%M:%S%Z
 TIME_PREFIX=regDate\"\s*:\s*\"
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 22 Sep 2016 21:14:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211184#M61742</guid>
      <dc:creator>lyndac</dc:creator>
      <dc:date>2016-09-22T21:14:29Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211185#M61743</link>
      <description>&lt;P&gt;Are your events breaking correctly? If you have set LINE_BREAKER, then SHOULD_LINEMERGE should be set to false, not true. For some reason, setting this through the UI does not work: Splunk reverts it to true and adds a BREAK_ONLY_BEFORE setting alongside the line breaker. This could be causing part of the problem that you are seeing ...&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 11:04:54 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211185#M61743</guid>
      <dc:creator>lquinn</dc:creator>
      <dc:date>2020-09-29T11:04:54Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211186#M61744</link>
      <description>&lt;P&gt;The events are breaking correctly, it's just that pesky initial square bracket.  I changed SHOULD_LINEMERGE to false and it didn't seem to change anything.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Sep 2016 12:42:49 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211186#M61744</guid>
      <dc:creator>lyndac</dc:creator>
      <dc:date>2016-09-23T12:42:49Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211187#M61745</link>
      <description>&lt;P&gt;I've been playing with the regex all day today.  The most recent incantation is:&lt;/P&gt;

&lt;P&gt;LINE_BREAKER=(^[[\n\r]+)|\s{4}},(,[\n\r])\s{4}{(.*)&lt;/P&gt;

&lt;P&gt;My thinking was if I could break the [ into its own event, then I could throw away that event using a transform.  However, it is still keeping the [ with the first object and now is splitting the event at random spots.&lt;/P&gt;</description>
      <pubDate>Fri, 23 Sep 2016 22:21:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211187#M61745</guid>
      <dc:creator>lyndac</dc:creator>
      <dc:date>2016-09-23T22:21:46Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211188#M61746</link>
      <description>&lt;P&gt;Finally got this working by using a PREAMBLE_REGEX to discard the opening array bracket.  Posting the props.conf here for completeness (in case someone else has this issue).&lt;/P&gt;

&lt;P&gt;[json_linebreaker]&lt;BR /&gt;
JSON_TRIM_BRACES_IN_ARRAY_NAMES=true&lt;BR /&gt;
KV_MODE=json&lt;BR /&gt;
PREAMBLE_REGEX=^\s{0,2}\[&lt;BR /&gt;
LINE_BREAKER=\s{4}},(,[\n\r])\s{4}({.*)&lt;BR /&gt;
MAX_TIMESTAMP_LOOKAHEAD=30&lt;BR /&gt;
NO_BINARY_CHECK=true&lt;BR /&gt;
SHOULD_LINEMERGE=false&lt;BR /&gt;
TIME_FORMAT=%Y-%m-%dT%H:%M:%S%Z&lt;BR /&gt;
TIME_PREFIX=regDate\"\s*:\s*\"&lt;/P&gt;</description>
      <pubDate>Tue, 29 Sep 2020 11:13:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211188#M61746</guid>
      <dc:creator>lyndac</dc:creator>
      <dc:date>2020-09-29T11:13:11Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211189#M61747</link>
      <description>&lt;P&gt;Of course, I only have a small sample of your data, but this seems to be working. The main challenge, as you mentioned, is the line breaking. Assuming that the first element of each JSON object is always the same (in your case, it starts with "team"), this regex should work:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LINE_BREAKER = (,*\s+){\s+"team"
&lt;/CODE&gt;&lt;/PRE&gt;
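
&lt;P&gt;For readers less familiar with LINE_BREAKER, here is an annotated sketch of how this pattern is meant to behave. The stanza name is a placeholder, and SHOULD_LINEMERGE = false follows the advice given earlier in this thread rather than anything specific to this regex:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# Sketch only: [my_json_array] is a hypothetical sourcetype name.
[my_json_array]
# LINE_BREAKER needs a capturing group. The text matched by the first
# group (here the optional comma plus whitespace) is discarded; the
# previous event ends before it and the next event starts after it,
# i.e. at the opening brace that precedes "team".
LINE_BREAKER = (,*\s+){\s+"team"
# Keep the events exactly as LINE_BREAKER split them (no re-merging).
SHOULD_LINEMERGE = false
&lt;/CODE&gt;&lt;/PRE&gt;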

&lt;P&gt;Once you have events breaking properly, the only thing left is to clean up the opening and closing square brackets with SEDCMD. The finished props.conf looks like this:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[answers]
LINE_BREAKER = (,*\s+){\s+"team"
TIME_PREFIX = regDate":\s"
MAX_TIMESTAMP_LOOKAHEAD = 30
NO_BINARY_CHECK = true
disabled = false
KV_MODE = json
SEDCMD-remove_opening = s/^\[//g
SEDCMD-remove_closing = s/\]$//g
JSON_TRIM_BRACES_IN_ARRAY_NAMES = true
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I had a similar issue, but my JSON objects were wrapped in yet another JSON array. The same solution worked there too. As long as you can line break on the first field of the object, you should be fine.&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{
  "Records": [
    {
        "team" : "spirit",
        "coach": "matt",
        "regDate": "2016-07-31T12:23:34Z"
    },
    {
        "team" : "chill",
        "coach": "bob",
        "regDate": "2016-08-01T12:15:19Z"
    }
  ]
}
&lt;/CODE&gt;&lt;/PRE&gt;
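
&lt;P&gt;If the wrapper line (the part up to and including "Records": [) ends up as its own junk event, one option is the idea raised earlier in this thread: throw that event away with a transform that routes it to nullQueue. This is an untested sketch. The transform name drop_json_wrapper and its REGEX are assumptions about what the leftover event looks like, so adjust them to your data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;# props.conf (sketch; add to the same sourcetype stanza)
[answers]
TRANSFORMS-drop_wrapper = drop_json_wrapper

# transforms.conf (sketch; drop_json_wrapper is a hypothetical name)
[drop_json_wrapper]
# Assumed shape of the leftover event: an optional opening bracket or
# brace, then "Records": [ and nothing else.
REGEX = ^\s*[\[{]?\s*"Records"\s*:\s*\[\s*$
DEST_KEY = queue
FORMAT = nullQueue
&lt;/CODE&gt;&lt;/PRE&gt;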

&lt;P&gt;I also spoke with someone from Splunk. They realize that JSON arrays are a common data structure nowadays, and they have an internal Jira task tracking this as a feature request.&lt;/P&gt;

&lt;P&gt;I hope it helps!&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jun 2017 18:11:22 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211189#M61747</guid>
      <dc:creator>aliakseidzianis</dc:creator>
      <dc:date>2017-06-16T18:11:22Z</dc:date>
    </item>
    <item>
      <title>Re: Need help with linebreaker for array of json objects</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211190#M61748</link>
      <description>&lt;P&gt;Thank you so much. This helped a ton !!&lt;/P&gt;</description>
      <pubDate>Thu, 06 Feb 2020 01:11:00 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Need-help-with-linebreaker-for-array-of-json-objects/m-p/211190#M61748</guid>
      <dc:creator>kkrishnan_splun</dc:creator>
      <dc:date>2020-02-06T01:11:00Z</dc:date>
    </item>
  </channel>
</rss>

