Now that we are thinking in Splunk terms, note that the ...... part in your illustration can make a difference in how best to construct a "solution". So, I assume that Datasets A and B are NOT from the same source, e.g., fields b and d must come from different sources, different sourcetypes, different periods of time, even different indices. Without such information, volunteers have to make assumptions that may or may not be helpful. Such assumptions have the biggest impact when the two datasets come from different indices and/or time periods.

For simplicity, I will assume a common scenario in which both datasets come from the same index and the same time period. Further assume that the only differentiating factor is sourcetype, A and B. An effective OR would be between these two:

index=common_index ((sourcetype = A) OR (sourcetype = B))

The above is often expressed as

index=common_index sourcetype IN (A, B)

Meanwhile, you may often have additional, differing search terms for A and B, so you may want to keep those parentheses. For example, you may want to restrict events to only those with fully populated fields of interest:

index=common_index ((sourcetype = A a=* b=* c=*) OR (sourcetype = B a=* d=* e=* f=*))

Anyway, my previous post only demonstrated how to leverage any key as the "primary key", but did not include the final step in an outer join. Here it is for your scenario:

| stats values(*) as * by a
| fields a b c d e f
| foreach *
[mvexpand <<FIELD>>]

Using your sample datasets, the output is

a   b   c   d   e   f
a1  b1  c1  d1  e1  f1
a1  b1  c1  d1  e1  f2
a1  b1  c1  d1  e1  f3
a1  b1  c1  d1  e2  f1
a1  b1  c1  d1  e2  f2
a1  b1  c1  d1  e2  f3
a1  b1  c1  d1  e3  f1
a1  b1  c1  d1  e3  f2
a1  b1  c1  d1  e3  f3
a1  b1  c1  d2  e1  f1
a1  b1  c1  d2  e1  f2
a1  b1  c1  d2  e1  f3
a1  b1  c1  d2  e2  f1
a1  b1  c1  d2  e2  f2
a1  b1  c1  d2  e2  f3
a1  b1  c1  d2  e3  f1
a1  b1  c1  d2  e3  f2
a1  b1  c1  d2  e3  f3
a1  b1  c1  d3  e1  f1
a1  b1  c1  d3  e1  f2

Here is an emulation that you can play with and compare with real data:

| makeresults
| eval _raw = "a,b,c
a1,b1,c1
a2,b2,c2"
| multikv forceheader=1
| fields - _* linecount
| eval sourcetype = "A"
| append
[makeresults
| eval _raw = "a,d,e,f
a1,d1,e1,f1
a1,d2,e2,f2
a1,d3,e3,f3
a2,d4,e4,f4
a2,d5,e5,f5"
| multikv forceheader=1
| fields - _* linecount
| eval sourcetype = "B"]
``` data emulation above ```
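
To check the join against the emulated data, one option is to append the join steps from this post to the emulation and then count rows per key. This is only a sketch using the field names from the sample data; the trailing stats count is added here purely as a sanity check and is not part of the join itself.

```spl
| stats values(*) as * by a
| fields a b c d e f
| foreach *
    [mvexpand <<FIELD>>]
| stats count by a
```

With the emulated data, every combination of the d, e and f values is produced for each value of a, so I would expect the count to be 27 for a1 (3 × 3 × 3) and 8 for a2 (2 × 2 × 2). Remove the final stats to see the joined rows themselves.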