Convert nested JSON into a table

cherrypick · ‎08-26-2024

Here I have 2 event statistics (id=superman & id=batman) in Json format. How do I arrange it in a table format based on the id and enemy information values? This is slightly different to previous questions as for the "enemies"-key values, the "enemy_information" has the value headers while "enemy_information_values" contains a list of values matching the "enemy_information" value headers.

For example, I want the result to look something like the table below.

I know the data needs pre-processing, but I wanted to know if it was possible to do so via SPL commands. The reasoning to avoid pre-processing is I already have previous data with the same format that are too far back for me to re-ingest.

[{

"id": "Superman",

"birthName": "Kal-El",

"origin": "Krypton",

"enemies": [

{

"enemy_information": [

"name",

"location",

"powers"

],

"enemy_information_values": [

[

"Doomsday",

"Kryptonian Prison",

[

"Super Strength",

"Invulnerability",

"Regeneration",

"Adaptation",

"Enhanced Durability",

"Immunity to Kryptonite"

]

],

[

"Lex Luthor",

"Metropolis",

[

"Genius-level Intellect",

"Skilled Strategist",

"Advanced Technology and Weaponry",

"Political Influence",

"Expert in Kryptonite"

]

}

]

},

{

"id": "Batman",

"birthName": "Bruce Wayne",

"origin": "Gotham City",

"enemies": [

{

"enemy_information": [

"name",

"location",

"powers"

],

"enemy_information_values": [

[

"Joker",

"Gotham City",

[

"Genius-level Intellect",

"Master of Psychological Manipulation",

"Skilled Hand-to-Hand Combatant",

"Expert in Criminal Psychology",

"Master of Disguise"

]

],

[

"Two-Face",

"Gotham City",

[

"Expert Marksman",

"Skilled Hand-to-Hand Combatant",

"Access to Advanced Weaponry",

"Strategic Mind",

"Psychological Trauma"

]

}

]

}]

cherrypick · ‎08-27-2024

nevermind doesn't work 😞

gcusello · ‎08-27-2024

Hi @cherrypick ,

good for you, see next time!

For the other people of Community, please describe how you solved the issue.

Ciao and happy splunking

Giuseppe

P.S.: Karma Points are appreciated by all the contributors 😉

cherrypick · ‎08-28-2024

I see thank you for letting me know! 😄

PickleRick · ‎08-26-2024

OK. I think I already told you about badly formed data. While in some cases you can argue which json structure will be better to represent your data, this one is clearly not a good approach. Especially for Splunk.

Let's take this snippet:

 "enemy_information": ["name", "location", "powers" ],
 "enemy_information_values": [
  [ "Doomsday", "Kryptonian Prison", [ "Super Strength", [...] "Immunity to Kryptonite"]
] [...]

There is no structural relation between enemy_information and enemy_information_values.

From Splunk's point of view those will parse out (leaving aside possibly nested multivalued fields which is not straightforward to deal with) as two separate multivalued fields with no relationship whatsoever between the values from one field and values from the other.

If anything it should be either

"enemy_attributes": {"name": "Doomsday". "location": "Seattle, WA" [...]},

or

"enemy_attributes": [ {"name":"name", "value": "Doomsday"}, {"name": "location", "value":"Paris, France"} ...]

Each option has its pros and cons but the one you're presenting only seems to have cons.

cherrypick · ‎08-26-2024

Hi PickleRick,

Yes, I'm very aware that the structure I'm working with is really bad 😅. But let's say if I do pre-process on the data (currently data input is a modular input), how am I able to do the same on previous data that's already been indexed? Furthermore, the data is continuous and I'm only able to retrieve up to max 10 days behind (I can't change this unfortunately).

So, if I adjust the pre-processing and make the data structure into something that makes sense, this will only take effect for all future data. If I wanted to display the same table, the SPL won't work with older data in the same index.

PickleRick · ‎08-28-2024

Yes, change of the data format can cause incompatibilities with earlier data. That's true.

The issue with your data in general (possibly not in the presented example) is - as I said - that you have separate arrays which splunk can parse into separate multivalued fields which are not related to one another. If you are absolutely sure that both of those multivalued fields are of the same cardinality and are 1-1 related with one another you can try to do join them using the mvzip() function. Then do mvexpand and split those values back to get corresponding pairs. One caveat though - since the values get merged into a single value, if they contain the delimiter you choose for mvzipping, it's gonna get ugly when you'll be trying to split them again.

So it's possible but pretty ugly (and working only with some strong assumptions.

gcusello · ‎08-26-2024

Hi @cherrypick ,

are you using INDEXED_EXTRACTIONS = JSON in your sourcetype?

Ciao.

Giuseppe

Convert nested JSON into a table

table

New This Month - Splunk Observability updates and improvements for faster ...

What's New in Splunk Cloud Platform 9.3.2411?

Buttercup Games: Further Dashboarding Techniques (Part 6)