<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Flatten JSON associative array with variable key names that even containing dots in Splunk Search</title>
    <link>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650246#M224795</link>
    <description>&lt;P&gt;I have JSON event data like this (it is shown as a collapsable tree structure in the event view):&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;{
    "data": {
        "192.168.1.1": {
            "ip": "192.168.1.1",
            "number": 0,
            "list": [
                {
                    "msg": "msg1",
                    "time": "2023-07-01T01:00:00",
                },
                {
                    "msg": "msg2",
                    "time": "2023-07-01T02:00:00",
                },
                {
                    "msg": "msg3",
                    "time": "2023-07-01T03:00:00",
                }
            ]
        },
        "192.168.1.2": {
            "ip": "192.168.1.2",
            "number": 2,
            "list": [
                {
                    "msg": "msg1",
                    "time": "2023-07-02T01:00:00",
                },
                {
                    "msg": "msg2",
                    "time": "2023-07-02T02:00:00",
                }
            ]
        }
    }
}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Please note:&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The key names under "data" are not known beforehand, but they are guaranteed to be IP addresses. That means they contain dots, which makes direct addressing such as "data.192.168.1.1.ip" difficult.&lt;/LI&gt;&lt;LI&gt;The number of entries in "data.X.list" is variable.&lt;/LI&gt;&lt;LI&gt;"data.X.number" is just any number, it does not contain the length of the list or so.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I want to flatten the structure into a tabular form, like this:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;ip&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;number&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;msg&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;time&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%"&gt;"192.168.1.1"&lt;/TD&gt;&lt;TD width="25%"&gt;0&lt;/TD&gt;&lt;TD width="25%"&gt;"msg1"&lt;/TD&gt;&lt;TD width="25%"&gt;"2023-07-01T01:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%"&gt;"192.168.1.1"&lt;/TD&gt;&lt;TD width="25%"&gt;0&lt;/TD&gt;&lt;TD width="25%"&gt;"msg2"&lt;/TD&gt;&lt;TD width="25%"&gt;"2023-07-01T02:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%"&gt;"192.168.1.1"&lt;/TD&gt;&lt;TD width="25%"&gt;0&lt;/TD&gt;&lt;TD width="25%"&gt;"msg3"&lt;/TD&gt;&lt;TD width="25%"&gt;"2023-07-01T03:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;"192.168.1.2"&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;"msg1"&lt;/TD&gt;&lt;TD&gt;"2023-07-02T01:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;"192.168.1.2"&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;"msg1"&lt;/TD&gt;&lt;TD&gt;"2023-07-02T02:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My strategy so far was:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;In the raw data (_raw), replace all dots in IP addresses by underscores, to avoid the dot notation hassle&lt;/LI&gt;&lt;LI&gt;Then use foreach to generate "iterator variables" for each ip entry&lt;/LI&gt;&lt;LI&gt;Then iterate over all "iterator variables", use &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt; as a placeholder for all spath operations within each IP address's sub-tree.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Something along the lines of&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex field=_raw mode=sed "s/192.168.([0-9]{1,3}).([0-9]{1,3})/192_168_\1_\2/g"
| foreach data.*.ip [ eval iterator_&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;='&amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;']
| foreach iterator_* [
    spath path=data.&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;.list{} output=&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json
    | eval &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json=mvmap(&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json, &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json."##&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;")
    | eval messages=mvappend(messages, &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json)
    | fields - &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json ]&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But the problems start with the fact that rex applied to _raw does not seem to have the desired effect. The closest I get are iterator variables still containing dotted IP addresses, such as "iterator_192.168.1.1". (This behaviour might be difficult to reproduce with makeresults sample data!)&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;What am I missing here?&lt;/P&gt;</description>
    <pubDate>Wed, 12 Jul 2023 16:13:29 GMT</pubDate>
    <dc:creator>rikinet</dc:creator>
    <dc:date>2023-07-12T16:13:29Z</dc:date>
    <item>
      <title>Flatten JSON associative array with variable key names that even containing dots</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650246#M224795</link>
      <description>&lt;P&gt;I have JSON event data like this (it is shown as a collapsable tree structure in the event view):&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;{
    "data": {
        "192.168.1.1": {
            "ip": "192.168.1.1",
            "number": 0,
            "list": [
                {
                    "msg": "msg1",
                    "time": "2023-07-01T01:00:00",
                },
                {
                    "msg": "msg2",
                    "time": "2023-07-01T02:00:00",
                },
                {
                    "msg": "msg3",
                    "time": "2023-07-01T03:00:00",
                }
            ]
        },
        "192.168.1.2": {
            "ip": "192.168.1.2",
            "number": 2,
            "list": [
                {
                    "msg": "msg1",
                    "time": "2023-07-02T01:00:00",
                },
                {
                    "msg": "msg2",
                    "time": "2023-07-02T02:00:00",
                }
            ]
        }
    }
}&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;Please note:&amp;nbsp;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The key names under "data" are not known beforehand, but they are guaranteed to be IP addresses. That means they contain dots, which makes direct addressing such as "data.192.168.1.1.ip" difficult.&lt;/LI&gt;&lt;LI&gt;The number of entries in "data.X.list" is variable.&lt;/LI&gt;&lt;LI&gt;"data.X.number" is just any number, it does not contain the length of the list or so.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;I want to flatten the structure into a tabular form, like this:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;TABLE border="1" width="100%"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;ip&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;number&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;msg&lt;/STRONG&gt;&lt;/TD&gt;&lt;TD width="25%"&gt;&lt;STRONG&gt;time&lt;/STRONG&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%"&gt;"192.168.1.1"&lt;/TD&gt;&lt;TD width="25%"&gt;0&lt;/TD&gt;&lt;TD width="25%"&gt;"msg1"&lt;/TD&gt;&lt;TD width="25%"&gt;"2023-07-01T01:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%"&gt;"192.168.1.1"&lt;/TD&gt;&lt;TD width="25%"&gt;0&lt;/TD&gt;&lt;TD width="25%"&gt;"msg2"&lt;/TD&gt;&lt;TD width="25%"&gt;"2023-07-01T02:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD width="25%"&gt;"192.168.1.1"&lt;/TD&gt;&lt;TD width="25%"&gt;0&lt;/TD&gt;&lt;TD width="25%"&gt;"msg3"&lt;/TD&gt;&lt;TD width="25%"&gt;"2023-07-01T03:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;"192.168.1.2"&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;"msg1"&lt;/TD&gt;&lt;TD&gt;"2023-07-02T01:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;"192.168.1.2"&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;"msg1"&lt;/TD&gt;&lt;TD&gt;"2023-07-02T02:00:00"&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My strategy so far was:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;In the raw data (_raw), replace all dots in IP addresses by underscores, to avoid the dot notation hassle&lt;/LI&gt;&lt;LI&gt;Then use foreach to generate "iterator variables" for each ip entry&lt;/LI&gt;&lt;LI&gt;Then iterate over all "iterator variables", use &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt; as a placeholder for all spath operations within each IP address's sub-tree.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Something along the lines of&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| rex field=_raw mode=sed "s/192.168.([0-9]{1,3}).([0-9]{1,3})/192_168_\1_\2/g"
| foreach data.*.ip [ eval iterator_&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;='&amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;']
| foreach iterator_* [
    spath path=data.&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;.list{} output=&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json
    | eval &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json=mvmap(&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json, &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json."##&amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;")
    | eval messages=mvappend(messages, &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json)
    | fields - &amp;lt;&amp;lt;MATCHSTR&amp;gt;&amp;gt;_json ]&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;But the problems start with the fact that rex applied to _raw does not seem to have the desired effect. The closest I get are iterator variables still containing dotted IP addresses, such as "iterator_192.168.1.1". (This behaviour might be difficult to reproduce with makeresults sample data!)&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;What am I missing here?&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jul 2023 16:13:29 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650246#M224795</guid>
      <dc:creator>rikinet</dc:creator>
      <dc:date>2023-07-12T16:13:29Z</dc:date>
    </item>
    <item>
      <title>Re: Flatten JSON associative array with variable key names that even containing dots</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650498#M224886</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/255974"&gt;@rikinet&lt;/a&gt;.,&lt;/P&gt;&lt;P&gt;I've had a crack at this and I think I've found a working solution.&lt;/P&gt;&lt;P&gt;Splunk has a few JSON commands (&lt;A href="https://docs.splunk.com/Documentation/SCS/current/SearchReference/JSONFunctions#json_extract_exact.28.26lt.3Bjson.26gt.3B.2C_.26lt.3Bkeys.26gt.3B.29" target="_self"&gt;see docs&lt;/A&gt;) that come in handy.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| makeresults | eval json="{\"data\":{\"192.168.1.1\":{\"ip\":\"192.168.1.1\",\"number\":0,\"list\":[{\"msg\":\"msg1\",\"time\":\"2023-07-01T01:00:00\"},{\"msg\":\"msg2\",\"time\":\"2023-07-01T02:00:00\"},{\"msg\":\"msg3\",\"time\":\"2023-07-01T03:00:00\"}]},\"192.168.1.2\":{\"ip\":\"192.168.1.2\",\"number\":2,\"list\":[{\"msg\":\"msg1\",\"time\":\"2023-07-02T01:00:00\"},{\"msg\":\"msg2\",\"time\":\"2023-07-02T02:00:00\"}]}}}"
| fromjson json
``` Above is just creating the test data ```

``` Get the "data" field from the JSON```
| eval ips = json_extract(json, "data")
| fields ips

``` Extract all the IPs - we'll use these later ```
| eval keys = json_keys(ips)
| eval keys= json_array_to_mv(keys)
| mvexpand keys

``` Now we have one IP address per row. Next we get each IPs specific data ```
| foreach keys [ | eval values = json_extract_exact(ips, &amp;lt;&amp;lt;FIELD&amp;gt;&amp;gt;)]
| fields keys, values
| fromjson values
| mvexpand list
| fromjson list

``` Now table it out```
| table ip, number, msg, time&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;That results in a table like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="spavin_0-1689318896673.png" style="width: 999px;"&gt;&lt;img src="https://community.splunk.com/t5/image/serverpage/image-id/26260iED367305C1C650C1/image-size/large?v=v2&amp;amp;px=999" role="button" title="spavin_0-1689318896673.png" alt="spavin_0-1689318896673.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;As per your notes:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The key names are dynamically identified under "data". The fact that they have dots doesn't really matter for the json commands used.&lt;/LI&gt;&lt;LI&gt;We mvxpand the "data.X.list" variable to make sure we get them all.&lt;/LI&gt;&lt;LI&gt;"data.X.number" is just any number - it's just an extra field in the list.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hopefully this query gets you closer to what you're looking for.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Cheers,&lt;BR /&gt;Daniel&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jul 2023 07:19:19 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650498#M224886</guid>
      <dc:creator>spavin</dc:creator>
      <dc:date>2023-07-14T07:19:19Z</dc:date>
    </item>
    <item>
      <title>Re: Flatten JSON associative array with variable key names that even contain dots</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650532#M224901</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.splunk.com/t5/user/viewprofilepage/user-id/147730"&gt;@spavin&lt;/a&gt;&amp;nbsp;, thank you very much - that works! Great help!&lt;BR /&gt;&lt;BR /&gt;Just a small addition: I think we can even skip the 'foreach' command and simply write&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;| eval values = json_extract_exact(ips, keys)&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;...because at this point after the 'mvexpand', 'keys' only contains a single key.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jul 2023 10:30:39 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650532#M224901</guid>
      <dc:creator>rikinet</dc:creator>
      <dc:date>2023-07-14T10:30:39Z</dc:date>
    </item>
    <item>
      <title>Re: Flatten JSON associative array with variable key names that even contain dots</title>
      <link>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650743#M224999</link>
      <description>&lt;P&gt;Well spotted!&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;I had a couple of variations, and left this in but you're right, the foreach isn't needed here.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2023 06:46:10 GMT</pubDate>
      <guid>https://community.splunk.com/t5/Splunk-Search/Flatten-JSON-associative-array-with-variable-key-names-that-even/m-p/650743#M224999</guid>
      <dc:creator>danspav</dc:creator>
      <dc:date>2023-07-17T06:46:10Z</dc:date>
    </item>
  </channel>
</rss>

