Splunk Search

Merging data from JSON snippets

rikinet
Path Finder

Dear Splunk community,

I have following sample input data, containing JSON snippets in MV fields:

 

| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
  json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
  json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| table _time, json1, json2

 

The lists of ids in json1 and json2 may be disjoint, identical or overlap. For example, in above data, id=1 and id=5 only exist in json1, id=6 only exists in json2, the other ids exist in both. Attributes can be null values, but may then be treated as if the id didn't exist.

For each event, I would like to merge the data from json1 and json2 into a single table with columns id, attrib_A and attrib_B.

The expected output for the sample data would be:

_timeidattrib_Aattrib_B
t1A1null
t2A2B2
t3A3B3
t4A4B4
t5A5null
t6nullB6
............
t+51A1null
t+52A2B2
t+53A3B3
t+54A4B4
t+55A5null
t+56nullB6


How can I achieve this in a straighforward way?

The following works for the sample data, but it seems overly complicated and am not sure if it works in all cases:

 

```insert after above sample data generation:```

```extract and expand JSONs```
| mvexpand json2
| spath input=json2
| rename id as json2_id

| mvexpand json1
| spath input=json1
| rename id as json1_id

| table _time, json1_id, attrib_A, json2_id, attrib_B

```create mv fields containing the subsets of IDs from json1 and json2```
| eventstats values(json1_id) as json1, values(json2_id) as json2 by _time
| eval only_json1=mvmap(json1, if(isnull(mvfind(json2, json1)), json1, null()))
| eval only_json2=mvmap(json2, if(isnull(mvfind(json1, json2)), json2, null()))
| eval both=mvmap(json1, if(isnotnull(mvfind(json2, json1)), json1, null()))

| table _time, json1_id, attrib_A, json2_id, attrib_B, json1, json2, only_json1, only_json2, both

```keep json2 record if a) json2_id equals json1_id or b) json2_id does not appear in json1```
| eval attrib_B=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), attrib_B, null())
| eval json2_id=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), json2_id, null())

```keep json1 record if a) json1_id equals json2_id or b) json1_id does not appear in json2```
| eval attrib_A=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), attrib_A, null())
| eval json1_id=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), json1_id, null())

```remove records where json1 and json2 are both null```
| where isnotnull(json1_id) or isnotnull(json2_id)

| table _time, json1_id, attrib_A, json2_id, attrib_B
| dedup _time, json1_id, attrib_A

 

 Thank you!

Labels (3)
0 Karma

PickleRick
SplunkTrust
SplunkTrust

Either I'm missing something here or you're overly complicating it.

Why there is a separate json1 and json2?

Sticking with your mockup data

| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| eval json=mvappend(json1,json2)
| fields _time json
| mvexpand json
| spath input=json
| fields - json
| stats values(attrib_*) as * by _time id

You can of course add fillnull at the end.

livehybrid
Super Champion

Hi @rikinet 

Would the following achieve what you're looking for?

livehybrid_0-1741187010912.png

 

| makeresults count=5 
| streamstats count as a 
| eval _time = _time + (60*a) 
| eval
    json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
    json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}" 
| makemv delim="#" json1 
| makemv delim="#" json2
``` end data prep ```
| eval data=mvappend(json1,json2)
| mvexpand data
| spath input=data path=id output=id
| spath input=data path=attrib_A output=attrib_A
| spath input=data path=attrib_B output=attrib_B
| stats values(attrib_A) as attrib_A values(attrib_B) as attrib_B by id
| table id, attrib_A, attrib_B

Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards

Will

0 Karma
Get Updates on the Splunk Community!

Cultivate Your Career Growth with Fresh Splunk Training

Growth doesn’t just happen—it’s nurtured. Like tending a garden, developing your Splunk skills takes the right ...

Introducing a Smarter Way to Discover Apps on Splunkbase

We’re excited to announce the launch of a foundational enhancement to Splunkbase: App Tiering.  Because we’ve ...

How to Send Splunk Observability Alerts to Webex teams in Minutes

As a Developer Evangelist at Splunk, my team and I are constantly tinkering with technology to explore its ...