<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: LINE_BREAKER in json files in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672844#M230431</link>
    <description>&lt;P&gt;I recommend testing your regex at &lt;A href="https://regex101.com/" target="_blank" rel="noopener"&gt;https://regex101.com/&lt;/A&gt; to confirm that it actually matches. When I paste your regex in, it does not seem to match, because it does not allow for the space character between "auditId" and the following colon (:) in the sample data.&lt;/P&gt;
&lt;P&gt;I would also recommend splitting the JSON events so that each one keeps its own curly brackets, like so:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;{

"event1keys" : "event1values",

....

}

{

"event2keys" : "event2values",

....

}&lt;/LI-CODE&gt;
&lt;P&gt;Your LINE_BREAKER value should therefore also match the opening curly brace and the newline after it, and its first capture group should include the characters between events that can be discarded, such as commas.&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;LINE_BREAKER=(,?[\r\n]+){\s*\"auditId\"&lt;/LI-CODE&gt;
&lt;P&gt;I also recommend setting SHOULD_LINEMERGE to false to prevent Splunk from re-assembling multi-line events after the split.&lt;/P&gt;</description>
    <pubDate>Thu, 28 Dec 2023 16:59:11 GMT</pubDate>
    <dc:creator>marnall</dc:creator>
    <dc:date>2023-12-28T16:59:11Z</dc:date>
    <item>
      <title>LINE_BREAKER in json files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672843#M230430</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;The LINE_BREAKER in my props configuration for a JSON-formatted file is not working; it is not breaking the JSON events. My props and sample JSON events are given below. Any recommendations would be highly appreciated, thank you!&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;props

[myprops]

CHARSET=UTF-8

KV_MODE-json

LINE_BREAKER=([\r\n]+)\"auditId\"\:

SHOULD_LINEMERGE=true

TIME_PREFIX="audittime": "

TIME_FORMAT=%Y-%m-%dT%H:%M:%S

TRUNCATE=9999



Sample Events

{
"items": [
{
"auditId" : 15067,
"secId": "mtt01",
"audittime": "2016-07-31T12:24:37Z",
"links": [
{
"name":"conanicaldba",
"href": "https://it.for.dev.com/opa-api"
},
{
"name":"describedbydba",
"href": "https://it.for.dev.com/opa-api/meta-data"
}
]
},
{
"auditId" : 16007,
"secId": "mtt01",
"audittime": "2016-07-31T12:23:47Z",
"links": [
{
"name":"conanicaldba",
"href": "https://it.for.dev.com/opa-api"
},
{
"name":"describedbydba",
"href": "https://it.for.dev.com/opa-api/meta-data"
}
]
},

{
"auditId" : 15165,
"secId": "mtt01",
"audittime": "2016-07-31T12:22:51Z",
"links": [
{
"name":"conanicaldba",
"href": "https://it.for.dev.com/opa-api"
},
{
"name":"describedbydba",
"href": "https://it.for.dev.com/opa-api/meta-data"
}
]
}
]&lt;/LI-CODE&gt;
</description>
      <pubDate>Thu, 28 Dec 2023 17:00:13 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672843#M230430</guid>
      <dc:creator>SplunkDash</dc:creator>
      <dc:date>2023-12-28T17:00:13Z</dc:date>
    </item>
    <item>
      <title>Re: LINE_BREAKER in json files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672844#M230431</link>
      <description>&lt;P&gt;I recommend testing your regex at &lt;A href="https://regex101.com/" target="_blank" rel="noopener"&gt;https://regex101.com/&lt;/A&gt; to confirm that it actually matches. When I paste your regex in, it does not seem to match, because it does not allow for the space character between "auditId" and the following colon (:) in the sample data.&lt;/P&gt;
&lt;P&gt;I would also recommend splitting the JSON events so that each one keeps its own curly brackets, like so:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;{

"event1keys" : "event1values",

....

}

{

"event2keys" : "event2values",

....

}&lt;/LI-CODE&gt;
&lt;P&gt;Your LINE_BREAKER value should therefore also match the opening curly brace and the newline after it, and its first capture group should include the characters between events that can be discarded, such as commas.&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;LINE_BREAKER=(,?[\r\n]+){\s*\"auditId\"&lt;/LI-CODE&gt;
&lt;P&gt;I also recommend setting SHOULD_LINEMERGE to false to prevent Splunk from re-assembling multi-line events after the split.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Dec 2023 16:59:11 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672844#M230431</guid>
      <dc:creator>marnall</dc:creator>
      <dc:date>2023-12-28T16:59:11Z</dc:date>
    </item>
    <item>
      <title>Re: LINE_BREAKER in json files</title>
      <link>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672847#M230432</link>
      <description>&lt;P&gt;After testing this sample file on my local instance, I think something like this could work.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[ &amp;lt;SOURCETYPE NAME&amp;gt; ]
...
LINE_BREAKER=([\r\n]+)\s*\{\s*[\r\n]+\s*\"auditId\"
TIME_FORMAT=%Y-%m-%dT%H:%M:%S
TIME_PREFIX=(?:.*[\r\n]+)*\"audittime\":\s*\"
SEDCMD-remove_trailing_comma=s/\,$//g
SEDCMD-remove_trailing_bracket=s/\][\r\n]+$//g
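# Assumption, not from the original post: the sample timestamps are from 2016, which is
# older than the default MAX_DAYS_AGO of 2000 days. If the resulting timestamp warning is
# unwanted, a larger value that covers the age of the data could be set, for example:
# MAX_DAYS_AGO=3650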
TRANSFORMS-remove_header=remove_json_header&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;BR /&gt;This is a parsed event from the sample file.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dtburrows3_0-1703782108727.png" style="width: 400px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28685i693813537FC0EE61/image-size/medium?v=v2&amp;amp;px=400" role="button" title="dtburrows3_0-1703782108727.png" alt="dtburrows3_0-1703782108727.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;I am getting a warning about the timestamp, not because Splunk is unable to find it, but because the datetime exceeds my configured limit for MAX_DAYS_AGO/MAX_DAYS_HENCE.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dtburrows3_1-1703782180916.png" style="width: 400px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28686i3B8CBC7BBB22C47A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="dtburrows3_1-1703782180916.png" alt="dtburrows3_1-1703782180916.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Note the transform included in the props.&lt;BR /&gt;&lt;BR /&gt;It is needed to remove the first part of the JSON file that the events are nested in.&lt;BR /&gt;There will need to be an accompanying stanza in transforms.conf specifying the regex used to recognize the event to send to the null queue. It would probably look something like this.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[remove_json_header]
REGEX = ^\s*\{\s*[\r\n]+\"items\":\s*\[
DEST_KEY = queue
FORMAT = nullQueue&lt;/LI-CODE&gt;</description>
      <pubDate>Thu, 28 Dec 2023 16:58:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/LINE-BREAKER-in-json-files/m-p/672847#M230432</guid>
      <dc:creator>dtburrows3</dc:creator>
      <dc:date>2023-12-28T16:58:23Z</dc:date>
    </item>
  </channel>
</rss>

