How can I meet full outer join requirement in my search??
table a and table b with only one filed in two rows are same i want to display all rows from both the tables:
like , table a with 5 rows
and table b with 6 rows both with a common field thredname
and thread name is same in 2 rows. when i am using join type=outer i am getting 6 rows including the details from table a.
but my need is like i want 9 table where rows with common Threadname will have fully filled fields and other may have their own data.
example
Table a
Threadname adata
a1 aaa
a2 aa
a3 aa
ab1 aabb
ab2 ab
Table b
Threadname bdata
ab1 aabbb
ab2 aabbb
b1 bb
b2 bbb
b3 bbb
b4 bbb
Expected Resultant Table
Threadname adata bdata
a1 aaa
a2 aa
a3 aa
ab1 aabb aabbb
ab2 ab aabbb
b1 bb
b2 bbb
b3 bbb
b4 bbb
please help
thank you
The join
command's outer join is actually a "left outer" join technically.
To do what you want, you should either
1) don't use join at all, but use append
and then stats
index=main source=file1 | stats count by Threadname adata | append [ search index=main source=file2 | stats count by Threadname bdata] | stats values(adata) as adata values(bdata) as bdata by Threadname
2) better, do it in a single search pipeline.
index=main (source=file1 OR source=file2) | stats values(adata) as adata values(bdata) as bdata by Threadname
I understand what you're saying about having to extract fields differently, and do different search language for one side versus the other, but too often people assume there's no way to do such things in a single pipeline. I recommend posting the full details of the searches in a separate question, asking whether you can get the desired end result without using join or append.
The join
command's outer join is actually a "left outer" join technically.
To do what you want, you should either
1) don't use join at all, but use append
and then stats
index=main source=file1 | stats count by Threadname adata | append [ search index=main source=file2 | stats count by Threadname bdata] | stats values(adata) as adata values(bdata) as bdata by Threadname
2) better, do it in a single search pipeline.
index=main (source=file1 OR source=file2) | stats values(adata) as adata values(bdata) as bdata by Threadname
I understand what you're saying about having to extract fields differently, and do different search language for one side versus the other, but too often people assume there's no way to do such things in a single pipeline. I recommend posting the full details of the searches in a separate question, asking whether you can get the desired end result without using join or append.
Also, some people looking at this solution may want to be using stats list
not stats values
here, depending on your outer join use case
Why are people complacent about outer join
functionality? Shouldn't we be questioning the SPL development skills here? Why do they even provide you with the outer option, if it's just going to be a left outer join? I would logically assume that the word "outer" means "full outer" here, especially considering I know there is already a command for an outer left join...
I understand why stats
is better than join
but why even offer the outer option with join
...?
This worked for me also! Thank you!
Worked like a charm
I think this is needs improving, but it does produce the results you've described.
I created a sourcetype of joinq that has 2 sources adata.txt and bdata.txt
The following search yields the table with 9 rows.
sourcetype=joinq source="adata.txt" | join threadname [search sourcetype=joinq source!="adata.txt"] | append [search sourcetype=joinq source="adata.txt" | join threadname type=outer [search sourcetype=joinq source!="adata.txt"]] | append [search sourcetype=joinq source!="adata.txt" | join threadname type=outer [search sourcetype=joinq source="adata.txt"]] | table threadname adata bdata | dedup threadname | sort threadname
Dave
Hmm, not sure I fully understand, but you might not require a join
, but that depends on how you get your data...
If the 'tables' you mention are not tables, but sourcetypes, each row could be seen as an event, with adata
and bdata
being fields. Then something like;
... | stats first(adata) as adata first(bdata) as bdata by Threadname
would be something that could work.
You should probably give us a few sample events or a better description of your data.
/K
This worked for me . Thank you
This worked for me also. Thank you
mmm i have to use a join for this search because i want to extract same field in two different column names so that i can differentiate it as data from different files so i need a join method that can provide full outer join values.. please help 😞
So.. then you could try:
index=main source=file1 OR source=file2 | stats first(adata) as adata first(bdata) as bdata by Threadname
assuming that there can be only one (or less) adata per Threadname, and only one (or less) bdata per Threadname.
they are the values extracted from log files . Threadname adata and btada are the fields extracted from 2 different log files. table a from file1 and table b from file 2..
search may look like
index=main sorce=file1 |top Threadname adata | join type=outer Threadname [ search index=main sorce=file2 |top Threadname bdata]
but i am not able to obtain full outer join the output looks like
Threadname adata bdata
a1 aaa
a2 aa
a3 aa
ab1 aabb aabbb
ab2 ab aabbb
Thanks for that, but I was more interested in where the data comes from. Are they csv files, or the output of some splunk reporting command, or in separate events (i.e. table_a in one event, and table_b in another)?
/k
Sorry if i confused you. edited the question. kindly please check it !!