<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679810#M113599</link>
    <description>&lt;P&gt;Dear Splunk user,&lt;/P&gt;&lt;P&gt;using this &lt;STRONG&gt;sample&lt;/STRONG&gt; data&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[{"Field 859": "Value aaaaa", "Field 2": "Value bbbbb"}, {"Field 1": "Value ccccc", "Field 2": "Value ddddd"}, {"Field 1": "Value eeeee", "Field 2": "Value fffff"}]
[{"Field 759": "Value ggggg", "Field 2": "Value hhhhh"}, {"Field 1": "Value iiiii", "Field 2": "Value jjjjj"}, {"Field 1": "Value kkkkk", "Field 2": "Value lllll"}]&lt;/LI-CODE&gt;&lt;P&gt;with this &lt;STRONG&gt;props.conf&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;[trbndrw_temp]
DATETIME_CONFIG = CURRENT
SHOULD_LINEMERGE = false
LINE_BREAKER = (?:\}(\s*,\s*)\{)|(\][\r\n]+\[)
TRANSFORMS-getrid = getridht&lt;/LI-CODE&gt;&lt;P&gt;and this &lt;STRONG&gt;transforms.conf&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;[getridht]
INGEST_EVAL = _raw=replace(_raw, "(\[|\])","")&lt;/LI-CODE&gt;&lt;P&gt;you may be able to achieve what you want&lt;BR /&gt;&lt;BR /&gt;Happy splunking&lt;BR /&gt;Luca (aka "one &lt;EM&gt;DASH&lt;/EM&gt;&amp;nbsp;is always better")&lt;/P&gt;</description>
    <pubDate>Wed, 06 Mar 2024 15:16:57 GMT</pubDate>
    <dc:creator>lucacaldiero</dc:creator>
    <dc:date>2024-03-06T15:16:57Z</dc:date>
    <item>
      <title>Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679636#M113575</link>
      <description>&lt;P&gt;Hello, I need help perfecting a sourcetype that doesn't index my JSON files correctly when I define multiple capture groups within the LINE_BREAKER parameter.&lt;/P&gt;&lt;P&gt;I'm using this other question to try to figure out how to make it work:&amp;nbsp;&lt;A href="https://community.splunk.com/t5/Getting-Data-In/How-to-handle-LINE-BREAKER-regex-for-multiple-capture-groups/m-p/291996" target="_blank"&gt;https://community.splunk.com/t5/Getting-Data-In/How-to-handle-LINE-BREAKER-regex-for-multiple-capture-groups/m-p/291996&lt;/A&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In my case, my JSON looks like this:&lt;/P&gt;&lt;P&gt;[{"Field 1": "Value 1", "Field N": "Value N"}, {"Field 1": "Value 1", "Field N": "Value N"}, {"Field 1": "Value 1", "Field N": "Value N"}]&lt;/P&gt;&lt;P&gt;Initially I tried:&lt;/P&gt;&lt;P&gt;LINE_BREAKER = }(,\s){&lt;/P&gt;&lt;P&gt;This split the events, except that the first and last records were not indexed correctly because of the "[" and "]" characters that lead and trail the payload.&lt;/P&gt;&lt;P&gt;After many attempts I have been unable to make it work, but based on what I've read, this seems to be the most intuitive way to define the capture groups:&lt;/P&gt;&lt;P&gt;LINE_BREAKER = ^([){|}(,\s){|}(])$&lt;/P&gt;&lt;P&gt;It doesn't work: the entire payload is indexed as a single event, which is formatted correctly but unusable.&lt;/P&gt;&lt;P&gt;Could somebody please suggest how to correctly define the LINE_BREAKER parameter for the sourcetype?&amp;nbsp; Here is the full version I'm using:&lt;/P&gt;&lt;P&gt;[area:prd:json]&lt;BR /&gt;SHOULD_LINEMERGE = false&lt;BR /&gt;TRUNCATE = 8388608&lt;BR /&gt;TIME_PREFIX = \"Updated\sdate\"\:\s\"&lt;BR /&gt;TIME_FORMAT = %Y-%m-%d %H:%M:%S&lt;BR /&gt;TZ = Europe/Paris&lt;BR /&gt;MAX_TIMESTAMP_LOOKAHEAD = -1&lt;BR /&gt;KV_MODE = json&lt;BR /&gt;LINE_BREAKER = ^([){|}(,\s){|}(])$&lt;/P&gt;&lt;P&gt;Other solutions to my problem are welcome as well!&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Andrew&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2024 19:09:21 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679636#M113575</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2024-03-05T19:09:21Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679746#M113583</link>
      <description>Hi&lt;BR /&gt;Based on your TIME_PREFIX, your example is not a complete sample! If you want us to help you, we really need the whole example JSON file.&lt;BR /&gt;r. Ismo</description>
      <pubDate>Wed, 06 Mar 2024 09:52:45 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679746#M113583</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2024-03-06T09:52:45Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679767#M113590</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/214410"&gt;@isoutamo&lt;/a&gt; for the response.&amp;nbsp; Here is a more accurate version of the payload:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[
    {
        "Assigned to": "Jones, Francis",
        "Cost": 3,
        "Created date": "2024-02-28 12:52:18",
        "Extraction date": "2024-03-02 13:51:00",
        "ID": 12345,
        "Initial Cost": 3,
        "Location": "Sites",
        "Path": "Sites\\FY1\\S3",
        "Priority": 1,
        "State": "In Progress",
        "Status Change date": "2024-03-05 16:33:23",
        "Tags": "Europe; Finance",
        "Title": "Ensure correct routing of orders",
        "Updated date": "2024-03-05 16:33:23",
        "Warranty": false,
        "Wave Quarter": "Q2 22",
        "Work Item Type": "Request"
    },
    {
        "Assigned to": "Jones, Francis",
        "Cost": 3,
        "Created date": "2024-02-28 18:59:18",
        "Extraction date": "2024-03-05 16:31:00",
        "ID": 12345,
        "Initial Cost": 3,
        "Location": "Sites",
        "Path": "Sites\\FY1\\S3",
        "Priority": 1,
        "State": "In Progress",
        "Status Change date": "2024-03-05 16:33:23",
        "Tags": "Europe; Finance",
        "Title": "Ensure correct routing of orders",
        "Updated date": "2024-03-05 16:33:23",
        "Warranty": false,
        "Wave Quarter": "Q2 22",
        "Work Item Type": "Request"
    },
    {
        "Assigned to": "Jones, Francis",
        "Cost": 3,
        "Created date": "2023-01-28 18:59:18",
        "Extraction date": "2023-02-05 16:31:00",
        "ID": 12345,
        "Initial Cost": 3,
        "Location": "Sites",
        "Path": "Sites\\FY1\\S3",
        "Priority": 1,
        "State": "In Progress",
        "Status Change date": "2023-02-05 16:33:23",
        "Tags": "Europe; Finance",
        "Title": "Ensure correct routing of orders",
        "Updated date": "2024-03-05 16:33:23",
        "Warranty": false,
        "Wave Quarter": "Q2 22",
        "Work Item Type": "Request"
    }
]&lt;/LI-CODE&gt;</description>
      <pubDate>Wed, 06 Mar 2024 11:00:42 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679767#M113590</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2024-03-06T11:00:42Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679779#M113592</link>
      <description>&lt;P&gt;Thanks.&lt;/P&gt;&lt;P&gt;This seems to work:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;LINE_BREAKER = (\[[\s\n\r]*\{|\},[\s\n\r]+\{|\}[\s\n\r]*)&lt;/LI-CODE&gt;&lt;P&gt;Why doesn't your regex work?&lt;/P&gt;&lt;P&gt;Splunk needs only one capture group for the line break. You have three separate groups, even though you have tried to make them alternatives with |. You also need to escape some of those characters (such as [ { ] }) so that they are recognised as literal characters. You can test this with&amp;nbsp;&lt;A href="https://regex101.com/r/IGQHd7/1" target="_blank"&gt;https://regex101.com/r/IGQHd7/1&lt;/A&gt;&lt;/P&gt;&lt;P&gt;When I test these, I just use regex101.com and/or Splunk GUI -&amp;gt; Settings -&amp;gt; Add Data -&amp;gt; Upload with an example file on my own laptop/workstation/dev server. That way it's easy to change the values and check how they affect the result.&lt;/P&gt;&lt;P&gt;You should also change&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;MAX_TIMESTAMP_LOOKAHEAD = 20&lt;/LI-CODE&gt;&lt;P&gt;As you define TIME_PREFIX, there is no reason to use -1 as the lookahead value. Splunk starts looking for the timestamp after the defined prefix and, as you can see, the correct timestamp is within 20 characters after it.&lt;/P&gt;&lt;P&gt;Why have you set KV_MODE=json? As you have broken this JSON into separate events, it is no longer JSON as a format. Now it's just a regular text-based event.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 11:58:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679779#M113592</guid>
      <dc:creator>isoutamo</dc:creator>
      <dc:date>2024-03-06T11:58:47Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679810#M113599</link>
      <description>&lt;P&gt;Dear Splunk user,&lt;/P&gt;&lt;P&gt;using this &lt;STRONG&gt;sample&lt;/STRONG&gt; data&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[{"Field 859": "Value aaaaa", "Field 2": "Value bbbbb"}, {"Field 1": "Value ccccc", "Field 2": "Value ddddd"}, {"Field 1": "Value eeeee", "Field 2": "Value fffff"}]
[{"Field 759": "Value ggggg", "Field 2": "Value hhhhh"}, {"Field 1": "Value iiiii", "Field 2": "Value jjjjj"}, {"Field 1": "Value kkkkk", "Field 2": "Value lllll"}]&lt;/LI-CODE&gt;&lt;P&gt;with this &lt;STRONG&gt;props.conf&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;[trbndrw_temp]
DATETIME_CONFIG = CURRENT
SHOULD_LINEMERGE = false
LINE_BREAKER = (?:\}(\s*,\s*)\{)|(\][\r\n]+\[)
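# Annotation (not part of the original post): a sketch of how this LINE_BREAKER is
# expected to behave, assuming the branched-expression rules described in
# props.conf.spec. Worth verifying on a test instance before relying on it.
#  - Branch 1, (?:\}(\s*,\s*)\{): only the comma and the whitespace around it are
#    captured and discarded, so one event keeps its closing "}" and the next keeps
#    its opening "{".
#  - Branch 2, (\][\r\n]+\[): when group 1 is not part of the match, the leftmost
#    group that is part of it (group 2) is treated as the break, so the "]", the
#    newline(s), and the "[" between two arrays are discarded.
#  - The "[" that opens the first array and the "]" that closes the last one are
#    not touched by the line breaker; the transform referenced below removes them.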
TRANSFORMS-getrid = getridht&lt;/LI-CODE&gt;&lt;P&gt;and this &lt;STRONG&gt;transforms.conf&lt;/STRONG&gt;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;[getridht]
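# Annotation (not part of the original post): replace() substitutes every match of
# the regex, so this removes all "[" and "]" characters from _raw, including any
# that appear inside field values, not only the brackets that wrap the array.
# If values may contain brackets, a narrower pattern would be needed (assumption:
# the sample data in this thread has none).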
INGEST_EVAL = _raw=replace(_raw, "(\[|\])","")&lt;/LI-CODE&gt;&lt;P&gt;you may be able to achieve what you want&lt;BR /&gt;&lt;BR /&gt;Happy splunking&lt;BR /&gt;Luca (aka "one &lt;EM&gt;DASH&lt;/EM&gt;&amp;nbsp;is always better")&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 15:16:57 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679810#M113599</guid>
      <dc:creator>lucacaldiero</dc:creator>
      <dc:date>2024-03-06T15:16:57Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679844#M113604</link>
      <description>&lt;P&gt;Thanks Luca, this works!&amp;nbsp; Appreciated!&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 17:16:05 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679844#M113604</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2024-03-06T17:16:05Z</dc:date>
    </item>
    <item>
      <title>Re: Why is my LINE_BREAKER parameter not breaking properly with multiple capture groups?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679845#M113605</link>
      <description>&lt;P&gt;Thank you for the feedback!&amp;nbsp; I will take your suggestions into consideration!&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 17:17:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Why-is-my-LINE-BREAKER-parameter-not-breaking-properly-with/m-p/679845#M113605</guid>
      <dc:creator>andrewtrobec</dc:creator>
      <dc:date>2024-03-06T17:17:25Z</dc:date>
    </item>
  </channel>
</rss>

