<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How do we extract multi-value fields from nested JSON? in Getting Data In</title>
    <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672234#M112611</link>
    <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/263242"&gt;@dtburrows3&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for the reply.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried these evals, and the fields are extracted from the tuples, but the association between them seems to be lost.&amp;nbsp;&lt;/P&gt;&lt;P&gt;This one event has 17 tuples in total, but after applying the evals, the resulting stats show several other combinations of src_ip &amp;amp; dst_ip.&lt;/P&gt;&lt;P&gt;Stats for the field&amp;nbsp;records{}.properties.flows{}.flows{}.flowTuples{}:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="flow_tuples.png" style="width: 509px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28578i5D2E2FB2D97F3D4B/image-size/large?v=v2&amp;amp;px=999" role="button" title="flow_tuples.png" alt="flow_tuples.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Stats on src_ip, dst_ip after applying the evals:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="flow_tuple_afterextraction.png" style="width: 999px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28580iC230F66505C06AFA/image-size/large?v=v2&amp;amp;px=999" role="button" title="flow_tuple_afterextraction.png" alt="flow_tuple_afterextraction.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Tue, 19 Dec 2023 01:02:56 GMT</pubDate>
    <dc:creator>att35</dc:creator>
    <dc:date>2023-12-19T01:02:56Z</dc:date>
    <item>
      <title>How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672213#M112609</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We are ingesting Azure NSG flow logs and visualizing them using the Microsoft Azure App for Splunk &lt;A href="https://splunkbase.splunk.com/app/4882" target="_blank" rel="noopener"&gt;https://splunkbase.splunk.com/app/4882&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The data is in JSON format with multiple levels/records in a single event. Each record can have multiple flows, flow tuples, etc. I am adding a few screenshots here to give context.&lt;/P&gt;&lt;P&gt;Default extractions for the main JSON fields look fine. But when it comes to values within the flow tuple field, i.e. records{}.properties.flows{}.flows{}.flowTuples{}, Splunk only keeps values from the very first entry.&lt;/P&gt;&lt;P&gt;How can I make the src_ip and dest_ip fields also get multiple values (across all records, flow tuples, etc.)?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="flowlogs_records.png" style="width: 362px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28573i5CE3980480B28B96/image-size/large?v=v2&amp;amp;px=999" role="button" title="flowlogs_records.png" alt="flowlogs_records.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Splunk extracts values only from that first highlighted entry" style="width: 800px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28575iE815E7BD182B7EB4/image-size/large?v=v2&amp;amp;px=999" role="button" title="flowlogs_tuples.png" alt="Splunk extracts values only from that first highlighted entry" /&gt;&lt;span class="lia-inline-image-caption" onclick="event.preventDefault();"&gt;Splunk extracts values only from that first highlighted entry&lt;/span&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Here is the extraction logic from this app:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;[extract_tuple]
SOURCE_KEY = records{}.properties.flows{}.flows{}.flowTuples{}
DELIMS = ","
FIELDS = time,src_ip,dst_ip,src_port,dst_port,protocol,traffic_flow,traffic_result&lt;/LI-CODE&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Mon, 18 Dec 2023 19:28:15 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672213#M112609</guid>
      <dc:creator>att35</dc:creator>
      <dc:date>2023-12-18T19:28:15Z</dc:date>
    </item>
    <item>
      <title>Re: How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672216#M112610</link>
      <description>&lt;P&gt;You can give these evals a go. I would check that every field comes out as expected.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I don't have access to any sourcetype="mscs:nsg:flow" data at the moment, so I am using simulated data based on your screenshots.&lt;BR /&gt;&lt;BR /&gt;If you are happy with the output, you could add them as calculated fields in local/props.conf (though I would make sure they don't step on any existing knowledge objects in the app).&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| eval time=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 0), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 0))), 'time')
    | eval src_ip=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 1), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 1))), 'src_ip')
    | eval dst_ip=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 2), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 2))), 'dst_ip')
    | eval src_port=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 3), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 3))), 'src_port')
    | eval dst_port=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 4), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 4))), 'dst_port')
    | eval protocol=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 5), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 5))), 'protocol')
    | eval traffic_flow=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 6), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 6))), 'traffic_flow')
    | eval traffic_result=if(isnotnull('records{}.properties.flows{}.flows{}.flowTuples{}'), case(mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')==1, mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 7), mvcount('records{}.properties.flows{}.flows{}.flowTuples{}')&amp;gt;1, mvmap('records{}.properties.flows{}.flows{}.flowTuples{}', mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 7))), 'traffic_result')&lt;/LI-CODE&gt;&lt;P&gt;Also, I'm not sure whether some events are formatted slightly differently when only a single flow occurred, so that it is no longer an array in the JSON event, which would change the extracted field name to something like "records{}.properties.flows{}.flows.flowTuples{}". From a look at the microsoft_azure app configs, it's only ever referencing "records{}.properties.flows{}.flows{}.flowTuples{}" for its extractions, so I assumed that events will always be formatted this way.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Dec 2023 21:24:51 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672216#M112610</guid>
      <dc:creator>dtburrows3</dc:creator>
      <dc:date>2023-12-18T21:24:51Z</dc:date>
    </item>
    <item>
      <title>Re: How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672234#M112611</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/263242"&gt;@dtburrows3&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for the reply.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried these evals, and the fields are extracted from the tuples, but the association between them seems to be lost.&amp;nbsp;&lt;/P&gt;&lt;P&gt;This one event has 17 tuples in total, but after applying the evals, the resulting stats show several other combinations of src_ip &amp;amp; dst_ip.&lt;/P&gt;&lt;P&gt;Stats for the field&amp;nbsp;records{}.properties.flows{}.flows{}.flowTuples{}:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="flow_tuples.png" style="width: 509px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28578i5D2E2FB2D97F3D4B/image-size/large?v=v2&amp;amp;px=999" role="button" title="flow_tuples.png" alt="flow_tuples.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Stats on src_ip, dst_ip after applying the evals:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="flow_tuple_afterextraction.png" style="width: 999px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28580iC230F66505C06AFA/image-size/large?v=v2&amp;amp;px=999" role="button" title="flow_tuple_afterextraction.png" alt="flow_tuple_afterextraction.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Dec 2023 01:02:56 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672234#M112611</guid>
      <dc:creator>att35</dc:creator>
      <dc:date>2023-12-19T01:02:56Z</dc:date>
    </item>
    <item>
      <title>Re: How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672236#M112612</link>
      <description>&lt;P&gt;To retain the associations for any sort of analysis, you may need to mvexpand the "records{}.properties.flows{}.flows{}.flowTuples{}" field itself.&lt;BR /&gt;&lt;BR /&gt;A stats aggregation that uses two multivalue fields as by-fields can be misleading, because every value of one field gets paired with every value of the other in the final output.&lt;BR /&gt;&lt;BR /&gt;Below is a table of the event you shared in the initial post after running mvexpand and then extracting the individual fields.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="dtburrows3_0-1702948883195.png" style="width: 400px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/28581iBE0275FB4DD78721/image-size/medium?v=v2&amp;amp;px=400" role="button" title="dtburrows3_0-1702948883195.png" alt="dtburrows3_0-1702948883195.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;SPL to do this:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| mvexpand "records{}.properties.flows{}.flows{}.flowTuples{}"
    | eval 
        time=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 0),
        src_ip=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 1),
        dst_ip=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 2),
        src_port=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 3),
        dst_port=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 4),
        protocol=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 5),
        traffic_flow=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 6),
        traffic_result=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 7)&lt;/LI-CODE&gt;&lt;P&gt;Doing a stats count by src_ip and dst_ip should make more sense with the data formatted this way.&lt;/P&gt;</description>
      <pubDate>Tue, 19 Dec 2023 01:28:25 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672236#M112612</guid>
      <dc:creator>dtburrows3</dc:creator>
      <dc:date>2023-12-19T01:28:25Z</dc:date>
    </item>
    <item>
      <title>Re: How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672242#M112613</link>
      <description>&lt;P&gt;Thanks&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/263242"&gt;@dtburrows3&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This method worked perfectly. I was able to extract the required fields while keeping the associations intact.&amp;nbsp;&lt;/P&gt;&lt;P&gt;However, when running this at scale, I get the following message:&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;command.mvexpand: output will be truncated at 2200 results due to excessive memory usage. Memory threshold of 500MB as configured in limits.conf / [mvexpand] / max_mem_usage_mb has been reached.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Are there any alternatives to mvexpand that would avoid these memory issues?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 19 Dec 2023 03:27:23 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672242#M112613</guid>
      <dc:creator>att35</dc:creator>
      <dc:date>2023-12-19T03:27:23Z</dc:date>
    </item>
    <item>
      <title>Re: How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672243#M112614</link>
      <description>&lt;P&gt;Yeah, unfortunately mvexpand can be memory intensive.&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;I would limit your field set as much as possible before using it and see if that helps.&lt;BR /&gt;&lt;BR /&gt;It may actually work to just do this:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;base_search&amp;gt;
    | stats count by "records{}.properties.flows{}.flows{}.flowTuples{}"
    | eval 
        time=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 0),
        src_ip=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 1),
        dst_ip=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 2),
        src_port=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 3),
        dst_port=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 4),
        protocol=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 5),
        traffic_flow=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 6),
        traffic_result=mvindex(split('records{}.properties.flows{}.flows{}.flowTuples{}', ","), 7)
    | stats
        sum(count) as total
            by src_ip, dst_ip&lt;/LI-CODE&gt;&lt;P&gt;This should tally up all the individual flow tuples across events, since stats produces one row per distinct value of a multivalue by-field. We can then use eval to split each tuple into its fields and sum the counts by source and destination IP.&lt;BR /&gt;&lt;BR /&gt;I think this gets around the need for mvexpand.&lt;BR /&gt;&lt;BR /&gt;Let me know if that works!&lt;/P&gt;</description>
      <pubDate>Tue, 19 Dec 2023 03:50:03 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672243#M112614</guid>
      <dc:creator>dtburrows3</dc:creator>
      <dc:date>2023-12-19T03:50:03Z</dc:date>
    </item>
    <item>
      <title>Re: How do we extract multi-value fields from nested JSON?</title>
      <link>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672300#M112621</link>
      <description>&lt;P&gt;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/263242"&gt;@dtburrows3&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you!!&lt;/P&gt;&lt;P&gt;This worked perfectly. No memory issues either.&lt;/P&gt;&lt;P&gt;Do you know if there is a way to apply these using props/transforms, or are these strictly inline search-time transformations?&lt;/P&gt;</description>
      <pubDate>Tue, 19 Dec 2023 12:54:53 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Getting-Data-In/How-do-we-extract-multi-value-fields-from-nested-JSON/m-p/672300#M112621</guid>
      <dc:creator>att35</dc:creator>
      <dc:date>2023-12-19T12:54:53Z</dc:date>
    </item>
  </channel>
</rss>

