<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Event Splitting on Nested JSON in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/675979#M113100</link>
    <description>&lt;P&gt;I have JSON files which I am trying to event split as the JSON contains multiple events within each log.&lt;/P&gt;</description>
    <pubDate>Wed, 31 Jan 2024 13:17:58 GMT</pubDate>
    <dc:creator>JakeInfoSec</dc:creator>
    <dc:date>2024-01-31T13:17:58Z</dc:date>
    <item>
      <title>Event Splitting on Nested JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/675979#M113100</link>
      <description>&lt;P&gt;I have JSON files that I am trying to split into events, because each log contains multiple events. Here is an example of what a log looks like.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;{
  "vulnerability": [
    {
      "event": {
        "sub1": {
          "complexity": "LOW"
        },
        "sub2": {
          "complexity": "LOW"
        }
      },
      "id": "test",
      "description": "test",
      "state": "No Known",
      "risk_rating": "LOW",
      "sources": [
        {
          "date": "test"
        }
      ],
      "additional_info": [
        {
          "test": "test"
        }
      ],
      "was_edited": false
    },
    {
      "event": {
        "sub1": {
          "complexity": "LOW"
        },
        "sub2": {
          "complexity": "LOW"
        }
      },
      "id": "test",
      "description": "test",
      "state": "No Known",
      "risk_rating": "LOW",
      "sources": [
        {
          "date": "test"
        }
      ],
      "additional_info": [
        {
          "test": "test"
        }
      ],
      "was_edited": false
    }
  ],
  "next": "test",
  "total_count": 109465
}&lt;/LI-CODE&gt;&lt;P&gt;In this example there are two separate events that I need extracted. I am essentially trying to pull out the event1 and event2 nests, i.e. the two objects inside the "vulnerability" array. Every log should have this exact JSON format, but it could contain any number of events.&lt;/P&gt;&lt;P&gt;First event&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;    {
      "event": {
        "sub1": {
          "complexity": "LOW"
        },
        "sub2": {
          "complexity": "LOW"
        }
      },
      "id": "test",
      "description": "test",
      "state": "No Known",
      "risk_rating": "LOW",
      "sources": [
        {
          "date": "test"
        }
      ],
      "additional_info": [
        {
          "test": "test"
        }
      ],
      "was_edited": false
    }&lt;/LI-CODE&gt;&lt;P&gt;Second event&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;    {
      "event": {
        "sub1": {
          "complexity": "LOW"
        },
        "sub2": {
          "complexity": "LOW"
        }
      },
      "id": "test",
      "description": "test",
      "state": "No Known",
      "risk_rating": "LOW",
      "sources": [
        {
          "date": "test"
        }
      ],
      "additional_info": [
        {
          "test": "test"
        }
      ],
      "was_edited": false
    }&lt;/LI-CODE&gt;&lt;P&gt;I also want to exclude the opening&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;{
  "vulnerability": [&lt;/LI-CODE&gt;&lt;P&gt;and closing&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;  ],
  "next": "test",
  "total_count": 109465
}&lt;/LI-CODE&gt;&lt;P&gt;portions of the log files.&lt;/P&gt;&lt;P&gt;Am I missing something in how this sourcetype should be set up? I currently have the following, but it does not seem to be working:&lt;/P&gt;&lt;P&gt;LINE_BREAKER = \{(\r+|\n+|\t+|\s+)"event":&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2024 13:17:58 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/675979#M113100</guid>
      <dc:creator>JakeInfoSec</dc:creator>
      <dc:date>2024-01-31T13:17:58Z</dc:date>
    </item>
    <item>
      <title>Re: Event Splitting on Nested JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676009#M113101</link>
      <description>&lt;P&gt;This is simply bad data (at least from Splunk's point of view).&lt;/P&gt;&lt;P&gt;Even if you managed to break it into events (and I honestly see no way to &lt;STRONG&gt;reliably&lt;/STRONG&gt; make sure you break in the proper places &lt;STRONG&gt;and only in those places&lt;/STRONG&gt;; manipulating structured data with regexes alone is not reliable because regexes are not structure-aware), you'd still have those headers and footers (attached to the end of another event).&lt;/P&gt;&lt;P&gt;Also, the resulting events would have inconsistent contents: one event would have an "event1" field, another an "event2" field.&lt;/P&gt;&lt;P&gt;The best solution here would be to process your data and split it &lt;STRONG&gt;before&lt;/STRONG&gt; pushing it to Splunk.&lt;/P&gt;</description>
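      <content:encoded><![CDATA[
A minimal sketch of the pre-splitting suggested above, assuming each file is a single JSON document whose top-level "vulnerability" key holds the array of events, as in the sample from the question. The function name split_events and its array_key parameter are made up for illustration, and nothing in it is Splunk-specific.

```python
import json
from typing import Iterator


def split_events(raw: str, array_key: str = "vulnerability") -> Iterator[str]:
    """Yield each element of the top-level array as a single-line JSON string.

    Assumption: `raw` is one complete JSON document shaped like the sample,
    i.e. an object with an array under `array_key`.
    """
    document = json.loads(raw)
    # Wrapper fields such as "next" and "total_count" are never emitted.
    for event in document.get(array_key, []):
        # Compact separators and no indentation keep each event on one line;
        # json.dumps escapes any newlines inside string values.
        yield json.dumps(event, separators=(",", ":"))


# Tiny stand-in for a real log file (hypothetical data).
sample = '{"vulnerability": [{"id": "a"}, {"id": "b"}], "next": "test", "total_count": 2}'
for line in split_events(sample):
    print(line)
# prints:
# {"id":"a"}
# {"id":"b"}
```

With one compact object per line, the wrapper never reaches the indexer and the sourcetype should no longer need a structure-aware regex. Breaking on newlines, e.g. LINE_BREAKER = ([\r\n]+) with SHOULD_LINEMERGE = false, would likely be enough.
      ]]></content:encoded>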
      <pubDate>Wed, 31 Jan 2024 08:10:47 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676009#M113101</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2024-01-31T08:10:47Z</dc:date>
    </item>
    <item>
      <title>Re: Event Splitting on Nested JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676092#M113108</link>
      <description>&lt;P&gt;Try this&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;LINE_BREAKER = ([\r\n]+)\{[\s\S]+?event\d
SEDCMD-stripStart = s/\{[\s\S]+?"vulnerability":\s\[//
SEDCMD-stripEnd = s/\],[\s\S]+?"next": .*//&lt;/LI-CODE&gt;&lt;P&gt;The &lt;FONT face="courier new,courier"&gt;[\s\S]+?&lt;/FONT&gt; construct usually works best at matching embedded newlines.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2024 15:32:46 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676092#M113108</guid>
      <dc:creator>richgalloway</dc:creator>
      <dc:date>2024-01-31T15:32:46Z</dc:date>
    </item>
    <item>
      <title>Re: Event Splitting on Nested JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676158#M113111</link>
      <description>&lt;P&gt;Are you sure it will work with multiline events? I'm not 100% sure which regex flags are enabled for SEDCMD.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2024 18:57:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676158#M113111</guid>
      <dc:creator>PickleRick</dc:creator>
      <dc:date>2024-01-31T18:57:51Z</dc:date>
    </item>
    <item>
      <title>Re: Event Splitting on Nested JSON</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676162#M113114</link>
      <description>&lt;P&gt;Yeah, I tried the LINE_BREAKER provided above but didn't have any luck. No matter what I tried, I couldn't get it working as hoped. I think you're right that the layout as-is is just bad, so I'm going to go back to the drawing board and try to change how the logs are formatted before they hit Splunk.&lt;/P&gt;</description>
      <pubDate>Wed, 31 Jan 2024 19:14:41 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/Event-Splitting-on-Nested-JSON/m-p/676162#M113114</guid>
      <dc:creator>JakeInfoSec</dc:creator>
      <dc:date>2024-01-31T19:14:41Z</dc:date>
    </item>
  </channel>
</rss>

