Dear Splunk community,
I have following sample input data, containing JSON snippets in MV fields:
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| table _time, json1, json2
The lists of ids in json1 and json2 may be disjoint, identical or overlap. For example, in above data, id=1 and id=5 only exist in json1, id=6 only exists in json2, the other ids exist in both. Attributes can be null values, but may then be treated as if the id didn't exist.
For each event, I would like to merge the data from json1 and json2 into a single table with columns id, attrib_A and attrib_B.
The expected output for the sample data would be:
_time | id | attrib_A | attrib_B |
t | 1 | A1 | null |
t | 2 | A2 | B2 |
t | 3 | A3 | B3 |
t | 4 | A4 | B4 |
t | 5 | A5 | null |
t | 6 | null | B6 |
... | ... | ... | ... |
t+5 | 1 | A1 | null |
t+5 | 2 | A2 | B2 |
t+5 | 3 | A3 | B3 |
t+5 | 4 | A4 | B4 |
t+5 | 5 | A5 | null |
t+5 | 6 | null | B6 |
How can I achieve this in a straighforward way?
The following works for the sample data, but it seems overly complicated and am not sure if it works in all cases:
```insert after above sample data generation:```
```extract and expand JSONs```
| mvexpand json2
| spath input=json2
| rename id as json2_id
| mvexpand json1
| spath input=json1
| rename id as json1_id
| table _time, json1_id, attrib_A, json2_id, attrib_B
```create mv fields containing the subsets of IDs from json1 and json2```
| eventstats values(json1_id) as json1, values(json2_id) as json2 by _time
| eval only_json1=mvmap(json1, if(isnull(mvfind(json2, json1)), json1, null()))
| eval only_json2=mvmap(json2, if(isnull(mvfind(json1, json2)), json2, null()))
| eval both=mvmap(json1, if(isnotnull(mvfind(json2, json1)), json1, null()))
| table _time, json1_id, attrib_A, json2_id, attrib_B, json1, json2, only_json1, only_json2, both
```keep json2 record if a) json2_id equals json1_id or b) json2_id does not appear in json1```
| eval attrib_B=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), attrib_B, null())
| eval json2_id=if(json2_id==json1_id or isnull(mvfind(json1, json2_id)), json2_id, null())
```keep json1 record if a) json1_id equals json2_id or b) json1_id does not appear in json2```
| eval attrib_A=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), attrib_A, null())
| eval json1_id=if(json1_id==json2_id or isnull(mvfind(json2, json1_id)), json1_id, null())
```remove records where json1 and json2 are both null```
| where isnotnull(json1_id) or isnotnull(json2_id)
| table _time, json1_id, attrib_A, json2_id, attrib_B
| dedup _time, json1_id, attrib_A
Thank you!
Either I'm missing something here or you're overly complicating it.
Why there is a separate json1 and json2?
Sticking with your mockup data
| makeresults count=5 | streamstats count as a | eval _time = _time + (60*a)
| eval
json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
| eval json=mvappend(json1,json2)
| fields _time json
| mvexpand json
| spath input=json
| fields - json
| stats values(attrib_*) as * by _time id
You can of course add fillnull at the end.
Hi @rikinet
Would the following achieve what you're looking for?
| makeresults count=5
| streamstats count as a
| eval _time = _time + (60*a)
| eval
json1="{\"id\":1,\"attrib_A\":\"A1\"}#{\"id\":2,\"attrib_A\":\"A2\"}#{\"id\":3,\"attrib_A\":\"A3\"}#{\"id\":4,\"attrib_A\":\"A4\"}#{\"id\":5,\"attrib_A\":\"A5\"}",
json2="{\"id\":2,\"attrib_B\":\"B2\"}#{\"id\":3,\"attrib_B\":\"B3\"}#{\"id\":4,\"attrib_B\":\"B4\"}#{\"id\":6,\"attrib_B\":\"B6\"}"
| makemv delim="#" json1
| makemv delim="#" json2
``` end data prep ```
| eval data=mvappend(json1,json2)
| mvexpand data
| spath input=data path=id output=id
| spath input=data path=attrib_A output=attrib_A
| spath input=data path=attrib_B output=attrib_B
| stats values(attrib_A) as attrib_A values(attrib_B) as attrib_B by id
| table id, attrib_A, attrib_B
Please let me know how you get on and consider adding karma to this or any other answer if it has helped.
Regards
Will