Parse json - relate parent key to child-array valu...

n_young · ‎08-17-2019

Source JSON Structure:

{
      "working": {
        "https://site.number.one": [
          {
            "metric": "html",
            "type": "requests",
            "value": 1,
            "friendlyValue": 1,
            "limit": 1,
            "friendlyLimit": 1,
            "limitType": "max",
            "status": "working"
          },
          {
            "metric": "html",
            "type": "transferSize",
            "value": 10662,
            "friendlyValue": "10.4 KB",
            "limit": 20000,
            "friendlyLimit": "19.5 KB",
            "limitType": "max",
            "status": "working"
          }
        ],
        "https://site.number.two": [
          {
            "metric": "html",
            "type": "requests",
            "value": 1,
            "friendlyValue": 1,
            "limit": 1,
            "friendlyLimit": 1,
            "limitType": "max",
            "status": "working"
          },
          {
            "metric": "html",
            "type": "transferSize",
            "value": 10662,
            "friendlyValue": "10.4 KB",
            "limit": 20000,
            "friendlyLimit": "19.5 KB",
            "limitType": "max",
            "status": "working"
          }
        ]
  },
  "failing": {
    "https://site.number.two": [
      {
        "metric": "firstPaint",
        "type": "timings",
        "value": 1392,
        "friendlyValue": "1.392 s",
        "limit": 1000,
        "friendlyLimit": "1.000 s",
        "limitType": "max",
        "status": "failing"
      }
    ],
    "https://site.number.four": [
      {
        "metric": "firstPaint",
        "type": "timings",
        "value": 1665,
        "friendlyValue": "1.665 s",
        "limit": 1000,
        "friendlyLimit": "1.000 s",
        "limitType": "max",
        "status": "failing"
      }
  ]
  }
}

So basically there's two root-level arrays ("working" and "failing") and then within each of those, there is a "site/url" and then within each "url" there's an array of metrics/values etc.

What i'm trying to do, is parse out the URLs (sites), and then their related metrics for comparison/charting etc. I can't figure out what the right combination of spath / field extraction / etc is, so i can chart things like the "html" size as it relates to the "firstPaint" metric. There are hundreds of other metrics, but i just pulled out a few in that source JSON as an example.

Ideally what i'm left with after the parsing, is an ability to work with these fields: Site, metric name, value, status. If i can figure that out, im sure i can expand the logic to include the other fields as needed.

Hopefully i've explained this well enough to describe what i'm trying to do.

to4kawa · ‎01-10-2020

| makeresults 
| eval _raw="{
       \"working\": {
         \"https://site.number.one\": [
           {
             \"metric\": \"html\",
             \"type\": \"requests\",
             \"value\": 1,
             \"friendlyValue\": 1,
             \"limit\": 1,
             \"friendlyLimit\": 1,
             \"limitType\": \"max\",
             \"status\": \"working\"
               },
           {
             \"metric\": \"html\",
             \"type\": \"transferSize\",
             \"value\": 10662,
             \"friendlyValue\": \"10.4 KB\",
             \"limit\": 20000,
             \"friendlyLimit\": \"19.5 KB\",
             \"limitType\": \"max\",
             \"status\": \"working\"
               }
         ],
         \"https://site.number.two\": [
           {
             \"metric\": \"html\",
             \"type\": \"requests\",
             \"value\": 1,
             \"friendlyValue\": 1,
             \"limit\": 1,
             \"friendlyLimit\": 1,
             \"limitType\": \"max\",
             \"status\": \"working\"
               },
           {
             \"metric\": \"html\",
             \"type\": \"transferSize\",
             \"value\": 10662,
             \"friendlyValue\": \"10.4 KB\",
             \"limit\": 20000,
             \"friendlyLimit\": \"19.5 KB\",
             \"limitType\": \"max\",
             \"status\": \"working\"
               }
         ]
   },
   \"failing\": {
     \"https://site.number.two\": [
       {
         \"metric\": \"firstPaint\",
         \"type\": \"timings\",
         \"value\": 1392,
         \"friendlyValue\": \"1.392 s\",
         \"limit\": 1000,
         \"friendlyLimit\": \"1.000 s\",
         \"limitType\": \"max\",
         \"status\": \"failing\"
           }
     ],
     \"https://site.number.four\": [
       {
         \"metric\": \"firstPaint\",
         \"type\": \"timings\",
         \"value\": 1665,
         \"friendlyValue\": \"1.665 s\",
         \"limit\": 1000,
         \"friendlyLimit\": \"1.000 s\",
         \"limitType\": \"max\",
         \"status\": \"failing\"
           }
   ]
   }
 }" 
| spath path=working output=raw 
| appendpipe 
    [ spath path=failing output=raw] 
| table raw 
| eval raw=replace(raw,"\]\,","] }#{") 
| makemv delim="#" raw 
| stats count by raw 
| fields - count 
| rex field=raw "(?<site_name>http.+)\":" 
| rex field=raw mode=sed "s/(?s)^\{.+\[//g" 
| rex field=raw mode=sed "s/(?s)\].+$//g" 
| eval raw=replace(raw,"\}\,","}#") 
| makemv delim="#" raw 
| eval tmp=mvmap(raw,raw."Z".random()) 
| stats values(site_name) as site_name by tmp 
| spath input=tmp 
| fields - tmp 
| table site_name metric type value friendlyValue limit friendlyLimit limitType status

Hi, @n_young
After that, try to carry out tallying.

vik_splunk · ‎08-18-2019

@n_young

What particular use case are you trying to satisfy? Some information could help.

Natively, Splunk should be able to parse the fields necessary without having to use spath/regex.

I was able to ingest the json provided and a table and transpose produces the fields for the most part. Based on the use case necessary, we can tweak the query to produce the necessary output

n_young · ‎08-18-2019

@vik_splunk The issue is that the "site" names are diverse/variable. I just used those as examples for posting the question here. The actual URLs/sites will be completely diverse --and there will be hundreds of them in the same JSON source file(s).

So, while i could do something like " | table site.number.two | transpose | .. that would only work for the sites i specify.

I figured i'd have to use something like spath / field extraction to pull each of the sites out into an individual variable.

vik_splunk · ‎08-19-2019

I am a bit confused @n_young because what it seems to me is you are referring to parsing and event breaking index time as opposed to field extraction.

To achieve the above desired event breaking, I am afraid you will have to modify the source data to be delivered in the desired format rather than attempting to let Splunk parse the event to the desired format.

I do understand that the field names would be different than the sample regex above. What I was trying to understand was the use case to see if the same could be accomplished within the field extraction that Splunk inherently performs.

n_young · ‎08-19-2019

@vik_splunk I dont have the ability to change the source JSON structure. It's coming out of a product directly, and i dont have control of the format/structure.

My question is most closely aligned with this former post, but the solution doesn't seem to work with my source structure.

https://answers.splunk.com/answers/174698/leveraging-spath-with-arbitrary-json-fields.html

n_young · ‎08-18-2019

what im trying to do, is convert each of the sub-arrays into events, so that i can parse/chart things like you normally would with individual events.

I guess ultimately that means figuring out how to parse the source JSON, into individual events, but each individual event's time would be that of the file, and each event would also have the parent "site" included.

So, the event would end up being something like this:

         "site":"https://site.number.one",
         "metric": "html",
         "type": "requests",
         "value": 1,
         "friendlyValue": 1,
         "limit": 1,
         "friendlyLimit": 1,
         "limitType": "max",
         "status": "working"

Parse json - relate parent key to child-array values

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

Laser Bananas and Edge Hubs: Exploring Operational Technology (OT) Data Through a ...

Event Series: Mastering AI Tokenomics and Splunk Agent Observability

span_metrics: The OpenTelemetry-Idiomatic Way to See Inside Your Services

Join the Conversation