Splunk Search

Merging data from JSON snippets

rikinet
Path Finder

Dear Splunk community,

I have following sample input data, containing JSON snippets in MV fields:

 

| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
  json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
  json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| table _time, json1, json2

 

The lists of ids in json1 and json2 may be disjoint, identical or overlap. For example, in above data, id=1 and id=5 only exist in json1, id=6 only exists in json2, the other ids exist in both. Attributes can be null values, but may then be treated as if the id didn't exist.

For each event, I would like to merge the data from json1 and json2 into a single table with columns id, attrib_A and attrib_B.

The expected output for the sample data would be:

_timeidattrib_Aattrib_B
t1A1null
t2A2B2
t3A3B3
t4A4B4
t5A5null
t6nullB6
............
t+51A1null
t+52A2B2
t+53A3B3
t+54A4B4
t+55A5null
t+56nullB6


How can I achieve this in a straighforward way?

The following works for the sample data, but it seems overly complicated and am not sure if it works in all cases:

 

```insert after above sample data generation:```

```extract and expand JSONs```
| mvexpand json2
| spath input=json2
| rename id as json2_id

| mvexpand json1
| spath input=json1
| rename id as json1_id

| table _time, json1_id, attrib_A, json2_id, attrib_B

```create mv fields containing the subsets of IDs from json1 and json2```
| eventstats values(json1_id) as json1, values(json2_id) as json2 by _time
| eval only_json1=mvmap(json1, if(isnull(mvfind(json2, json1)), json1, null()))
| eval only_json2=mvmap(json2, if(isnull(mvfind(json1, json2)), json2, null()))
| eval both=mvmap(json1, if(isnotnull(mvfind(json2, json1)), json1, null()))

| table _time, json1_id, attrib_A, json2_id, attrib_B, json1, json2, only_json1, only_json2, both

```keep json2 record if a) json2_id equals json1_id or b) json2_id does not appear in json1```
| eval attrib_B=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), attrib_B, null())
| eval json2_id=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), json2_id, null())

```keep json1 record if a) json1_id equals json2_id or b) json1_id does not appear in json2```
| eval attrib_A=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), attrib_A, null())
| eval json1_id=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), json1_id, null())

```remove records where json1 and json2 are both null```
| where isnotnull(json1_id) or isnotnull(json2_id)

| table _time, json1_id, attrib_A, json2_id, attrib_B
| dedup _time, json1_id, attrib_A

 

 Thank you!

Labels (3)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

Either I'm missing something here or you're overly complicating it.

Why there is a separate json1 and json2?

Sticking with your mockup data

| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| eval json=mvappend(json1,json2)
| fields _time json
| mvexpand json
| spath input=json
| fields - json
| stats values(attrib_*) as * by _time id

You can of course add fillnull at the end.

livehybrid
Champion

Hi @rikinet 

Would the following achieve what you're looking for?

livehybrid_0-1741187010912.png

 

| makeresults count=5 
| streamstats count as a 
| eval _time = _time + (60*a) 
| eval
    json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
    json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}" 
| makemv delim="#" json1 
| makemv delim="#" json2
``` end data prep ```
| eval data=mvappend(json1,json2)
| mvexpand data
| spath input=data path=id output=id
| spath input=data path=attrib_A output=attrib_A
| spath input=data path=attrib_B output=attrib_B
| stats values(attrib_A) as attrib_A values(attrib_B) as attrib_B by id
| table id, attrib_A, attrib_B

Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards

Will

0 Karma
Get Updates on the Splunk Community!

Say goodbye to manually analyzing phishing and malware threats with Splunk Attack ...

In today’s evolving threat landscape, we understand you’re constantly bombarded with phishing and malware ...

AppDynamics is now part of Splunk Ideas

Hello Splunkers, We have exciting news for you! AppDynamics has been added to the Splunk Ideas Portal. Which ...

Advanced Splunk Data Management Strategies

Join us on Wednesday, May 14, 2025, at 11 AM PDT / 2 PM EDT for an exclusive Tech Talk that delves into ...