Merging data from JSON snippets

rikinet · ‎03-05-2025

Dear Splunk community,

I have following sample input data, containing JSON snippets in MV fields:

| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
  json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
  json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| table _time, json1, json2

The lists of ids in json1 and json2 may be disjoint, identical or overlap. For example, in above data, id=1 and id=5 only exist in json1, id=6 only exists in json2, the other ids exist in both. Attributes can be null values, but may then be treated as if the id didn't exist.

For each event, I would like to merge the data from json1 and json2 into a single table with columns id, attrib_A and attrib_B.

The expected output for the sample data would be:

_time	id	attrib_A	attrib_B
t	1	A1	null
t	2	A2	B2
t	3	A3	B3
t	4	A4	B4
t	5	A5	null
t	6	null	B6
...	...	...	...
t+5	1	A1	null
t+5	2	A2	B2
t+5	3	A3	B3
t+5	4	A4	B4
t+5	5	A5	null
t+5	6	null	B6

How can I achieve this in a straighforward way?

The following works for the sample data, but it seems overly complicated and am not sure if it works in all cases:

```insert after above sample data generation:```

```extract and expand JSONs```
| mvexpand json2
| spath input=json2
| rename id as json2_id

| mvexpand json1
| spath input=json1
| rename id as json1_id

| table _time, json1_id, attrib_A, json2_id, attrib_B

```create mv fields containing the subsets of IDs from json1 and json2```
| eventstats values(json1_id) as json1, values(json2_id) as json2 by _time
| eval only_json1=mvmap(json1, if(isnull(mvfind(json2, json1)), json1, null()))
| eval only_json2=mvmap(json2, if(isnull(mvfind(json1, json2)), json2, null()))
| eval both=mvmap(json1, if(isnotnull(mvfind(json2, json1)), json1, null()))

| table _time, json1_id, attrib_A, json2_id, attrib_B, json1, json2, only_json1, only_json2, both

```keep json2 record if a) json2_id equals json1_id or b) json2_id does not appear in json1```
| eval attrib_B=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), attrib_B, null())
| eval json2_id=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), json2_id, null())

```keep json1 record if a) json1_id equals json2_id or b) json1_id does not appear in json2```
| eval attrib_A=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), attrib_A, null())
| eval json1_id=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), json1_id, null())

```remove records where json1 and json2 are both null```
| where isnotnull(json1_id) or isnotnull(json2_id)

| table _time, json1_id, attrib_A, json2_id, attrib_B
| dedup _time, json1_id, attrib_A

Thank you!

PickleRick · ‎03-05-2025

Either I'm missing something here or you're overly complicating it.

Why there is a separate json1 and json2?

Sticking with your mockup data

| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| eval json=mvappend(json1,json2)
| fields _time json
| mvexpand json
| spath input=json
| fields - json
| stats values(attrib_*) as * by _time id

You can of course add fillnull at the end.

livehybrid · ‎03-05-2025

Hi @rikinet

Would the following achieve what you're looking for?

| makeresults count=5 
| streamstats count as a 
| eval _time = _time + (60*a) 
| eval
    json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
    json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}" 
| makemv delim="#" json1 
| makemv delim="#" json2
``` end data prep ```
| eval data=mvappend(json1,json2)
| mvexpand data
| spath input=data path=id output=id
| spath input=data path=attrib_A output=attrib_A
| spath input=data path=attrib_B output=attrib_B
| stats values(attrib_A) as attrib_A values(attrib_B) as attrib_B by id
| table id, attrib_A, attrib_B

Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards

Will

Merging data from JSON snippets

join

stats

table

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

What’s New in Splunk AI: Volume 02

Splunk App Dev Quarterly Roundup: AI, Agents, and Innovation!

Value Insights: Now Generally Available in the CMC

Join the Conversation

Merging data from JSON snippets

join

stats

table

Join the Splunk Community Slack to learn, troubleshoot, and make connections with fellow Splunk practitioners in real time!

Join Splunk User Groups to connect and learn in-person by region or remotely by topic or industry.

What’s New in Splunk AI: Volume 02

Splunk App Dev Quarterly Roundup: AI, Agents, and Innovation!

Value Insights: Now Generally Available in the CMC