I have three data sets (say src1, src2, src3). The merged result set of any single merge is larger than the 50k limit, so normal joins are ruled out. In this case src1 references a field in src2, and src2 references a different field in src3. The objective is to merge across all three data sets, based on certain conditions, in two individual merges (src1 & src2; src2 & src3). I managed to get two sources merged, but merging the third is perplexing because of the limitations on running a sub-search or join over the stats results. Any thoughts?
Code for merging two sources:
sourcetype="srctype:source1" OR sourcetype="srctype:source2"
| eval new_gid = coalesce(ref_src1,ref_src2)
| stats list(field1_src1) as field1_src1
        list(field2_src1) as field2_src1
        list(field3_src1) as field3_src1
        list(field1_src2) as field1_src2
        list(field2_src2) as field2_src2
        by new_gid
| where field3_src1="cat1" OR field3_src1="cat2"
A single stats result row for the above search would look something like this.
new_gid            field1_src1  field2_src1  field3_src1  field1_src2  field2_src2
03759542db9fb2404  3958294      5            cat1         69.13        c2bf762edb7b1600ab
                                                          266.7        7f4f9f7adb44e600b1
                                                          0.01         7f4f9f7adb44e600b1
This result set then has to be merged with src3 using field2_src2, so that only one of the three values remains (which one depends on the values of certain fields in src3).
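One direction that might avoid a sub-search or join altogether is to pull all three sourcetypes into a single search and copy the src3 field onto the matching src2 events with eventstats before running the final stats. This is only a rough sketch under assumptions: the sourcetype "srctype:source3", the src3 key field ref_src3, the src3 field field1_src3 and the value "keep" are all hypothetical placeholders, not names from my actual data. eventstats also has its own memory limits, so this may or may not hold up at this volume.

```
sourcetype="srctype:source1" OR sourcetype="srctype:source2" OR sourcetype="srctype:source3"
| eval link_key = coalesce(field2_src2, ref_src3)
| eventstats values(field1_src3) as src3_flag by link_key
| search sourcetype="srctype:source1" OR (sourcetype="srctype:source2" AND src3_flag="keep")
| eval new_gid = coalesce(ref_src1, ref_src2)
| stats list(field1_src1) as field1_src1
        list(field2_src1) as field2_src1
        list(field3_src1) as field3_src1
        list(field1_src2) as field1_src2
        list(field2_src2) as field2_src2
        by new_gid
| where field3_src1="cat1" OR field3_src1="cat2"
```

The idea is that src1 events have no link_key and should pass through eventstats unchanged, src2 events pick up src3_flag from the src3 events sharing their key, and the search line then drops the src3 events together with any src2 events that fail the src3 condition, so that only the surviving src2 values reach the stats by new_gid.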