<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Json event breaking not working as expected in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468593#M80661</link>
    <description>&lt;P&gt;I'd recommend using explicit &lt;CODE&gt;LINE_BREAKER&lt;/CODE&gt; and &lt;CODE&gt;SHOULD_LINEMERGE=false&lt;/CODE&gt;. That is much more predictable and is also more performant.&lt;/P&gt;

&lt;P&gt;Something like this should work for your data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LINE_BREAKER = ([\r\n]*\[|,\s+)\{"username":
SHOULD_LINEMERGE=false
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Because Splunk discards whatever the first capture group of &lt;CODE&gt;LINE_BREAKER&lt;/CODE&gt; matches, this also strips the leading &lt;CODE&gt;[&lt;/CODE&gt; and the &lt;CODE&gt;,&lt;/CODE&gt; between records. The only SEDCMD you still need is one that strips the trailing &lt;CODE&gt;]&lt;/CODE&gt;. See: &lt;A href="https://regex101.com/r/8zGyMS/1"&gt;https://regex101.com/r/8zGyMS/1&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Note: SEDCMD applies after line breaking.&lt;/P&gt;</description>
    <pubDate>Fri, 20 Dec 2019 11:23:03 GMT</pubDate>
    <dc:creator>FrankVl</dc:creator>
    <dc:date>2019-12-20T11:23:03Z</dc:date>
    <item>
      <title>Json event breaking not working as expected</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468592#M80660</link>
      <description>&lt;P&gt;Original log:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 08:26:23.547000+00:00", "context_ip": "xxx", "context_page_referrer": "xxx", "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}, {"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 12:53:32.350000+00:00", "context_ip": "xxx", "context_page_referrer": null, "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}]
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Expected logs:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 08:26:23.547000+00:00", "context_ip": "xxx", "context_page_referrer": "xxx", "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null} 

{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 12:53:32.350000+00:00", "context_ip": "xxx", "context_page_referrer": null, "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;My current props.conf is:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[xxx]
SHOULD_LINEMERGE=true
NO_BINARY_CHECK=true
CHARSET=UTF-8
SEDCMD-remove_prefix=s/\[//g
SEDCMD-remove_suffix=s/\]//g
SEDCMD-removeeventcommas=s/}, {"username":/}{"username":/g
BREAK_ONLY_BEFORE=\{\"username\"                              &amp;lt;-- This one is not working
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Output I am getting with the above props.conf:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 08:26:23.547000+00:00", "context_ip": "xxx", "context_page_referrer": "xxx", "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}{"username": "xxx", "event": "session_start", "event_category": "session", "timestamp": "2019-12-11 12:53:32.350000+00:00", "context_ip": "xxx", "context_page_referrer": null, "context_page_url": "xxx", "context_page_search": null, "context_user_agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36", "context_data": null, "response": null}
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;I am validating this on the second configuration page of Add Data, while uploading a sample log file through the web UI.&lt;/P&gt;

&lt;P&gt;What am I missing?&lt;/P&gt;</description>
      <pubDate>Fri, 20 Dec 2019 03:36:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468592#M80660</guid>
      <dc:creator>kishor_pinjarka</dc:creator>
      <dc:date>2019-12-20T03:36:21Z</dc:date>
    </item>
    <item>
      <title>Re: Json event breaking not working as expected</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468593#M80661</link>
      <description>&lt;P&gt;I'd recommend using explicit &lt;CODE&gt;LINE_BREAKER&lt;/CODE&gt; and &lt;CODE&gt;SHOULD_LINEMERGE=false&lt;/CODE&gt;. That is much more predictable and is also more performant.&lt;/P&gt;

&lt;P&gt;Something like this should work for your data:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;LINE_BREAKER = ([\r\n]*\[|,\s+)\{"username":
SHOULD_LINEMERGE=false
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Because Splunk discards whatever the first capture group of &lt;CODE&gt;LINE_BREAKER&lt;/CODE&gt; matches, this also strips the leading &lt;CODE&gt;[&lt;/CODE&gt; and the &lt;CODE&gt;,&lt;/CODE&gt; between records. The only SEDCMD you still need is one that strips the trailing &lt;CODE&gt;]&lt;/CODE&gt;. See: &lt;A href="https://regex101.com/r/8zGyMS/1"&gt;https://regex101.com/r/8zGyMS/1&lt;/A&gt;&lt;/P&gt;
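
&lt;P&gt;To see why the capture group matters, here is a rough Python sketch (not Splunk code) of the splitting behaviour: the text matched by the first capture group is thrown away, and whatever follows it starts the next event. The &lt;CODE&gt;break_events&lt;/CODE&gt; helper and the shortened sample record are made up for illustration, and the final &lt;CODE&gt;re.sub&lt;/CODE&gt; only stands in for a SEDCMD that strips the trailing &lt;CODE&gt;]&lt;/CODE&gt;:&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;import re

# Same pattern as the LINE_BREAKER above; group 1 is the part that gets dropped.
LINE_BREAKER = re.compile(r'([\r\n]*\[|,\s+)\{"username":')

def break_events(raw):
    """Hypothetical helper (not a Splunk API) that mimics LINE_BREAKER splitting."""
    events = []
    start = 0
    for m in LINE_BREAKER.finditer(raw):
        # Everything before the capture group closes the previous event.
        chunk = raw[start:m.start(1)]
        if chunk:
            events.append(chunk)
        # Everything after the capture group opens the next event.
        start = m.end(1)
    tail = raw[start:]
    if tail:
        events.append(tail)
    return events

# Shortened, made-up sample in the same shape as the original log.
raw = '[{"username": "a", "event": "session_start"}, {"username": "b", "event": "session_start"}]'

events = break_events(raw)
# Stand-in for a SEDCMD that removes a trailing "]" from each event.
events = [re.sub(r'\]\s*$', '', e) for e in events]

for e in events:
    print(e)
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;With the sample above this should print the two JSON objects on separate lines, without the surrounding brackets or the separating comma.&lt;/P&gt;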

&lt;P&gt;Note: SEDCMD applies after line breaking.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Dec 2019 11:23:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468593#M80661</guid>
      <dc:creator>FrankVl</dc:creator>
      <dc:date>2019-12-20T11:23:03Z</dc:date>
    </item>
    <item>
      <title>Re: Json event breaking not working as expected</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468594#M80662</link>
      <description>&lt;P&gt;Use &lt;EM&gt;ONLY&lt;/EM&gt; this (do not add back any of the settings that I dropped):&lt;/P&gt;

&lt;PRE&gt;&lt;CODE&gt;[xxx]
SHOULD_LINEMERGE=false
LINE_BREAKER = ((?:(?:^|\][\r\n]+)\[)|,\s+)\{"username"
NO_BINARY_CHECK=true
CHARSET=UTF-8
SEDCMD-remove_suffix=s/]//g
&lt;/CODE&gt;&lt;/PRE&gt;

&lt;P&gt;Never, EVER use &lt;CODE&gt;SHOULD_LINEMERGE = true&lt;/CODE&gt; and the &lt;CODE&gt;BREAK_*&lt;/CODE&gt; junk. I have only ever seen one case where it was necessary.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Dec 2019 12:26:30 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Json-event-breaking-not-working-as-expected/m-p/468594#M80662</guid>
      <dc:creator>woodcock</dc:creator>
      <dc:date>2019-12-20T12:26:30Z</dc:date>
    </item>
  </channel>
</rss>

