Splunk Search

Emulating FULL OUTER JOIN

andreafebbo
Communicator

Hi,
I have to tables:

ID name
1..A
2..B

ID Error
1..bla1
1..bla2

so Id like a table which is like the following

ID name error
1..A........bla1
1..A........bla2
2..B........(empty)

In SQL this is called FULL OUTER JOIN but i cannot find a way to replicate it since the parameter

type=outer

in the command join means LEFT JOIN in the reality.

I found some questions that try to resolve the issue with the search command but in my case it does not work.

Thank you

sandrosov_splun
Splunk Employee
Splunk Employee

The author needed a left join as mentioned before, but I see demand for this task is still high.

I spent a little bit of time on this. So, try the following.

| inputlookup tableA | eval key=localfield1
| append [|inputlookup tableB | eval key=localfield2]
| selfjoin max=0 keepsingle=yes key

 

0 Karma

tdiestel
Path Finder

This seems like you just want to do a Left join. so

index=A|table ID Name | join type=left max=0 ID [search index=B |table ID Error]

arlington
Explorer

Thanks, man, that's indeed the great answer, which helped me a lot. But what is the max=0 indicates here? Event If there is not any data in the index B, still it shows 1, without max=0 it calculates all of them as 1.

0 Karma

cmerriman
Super Champion

try something like this, maybe:

UPDATED

 index=MYINDEX source=MYSOURCE 
                   | eval Day = strftime(_time,"%F")
                   | eval DateTime = strftime(_time,"%Y-%m-%d %H:%M:%S")
                   | stats Latest(LogType) as status Latest(Result) as Result Latest(DateTime) as When by PackageName  ,ExecutionInstanceGUID, Day
                   | sort When
                   | streamstats  count as "Execution Nr" by PackageName, Day
                   | sort - When
                   | table When, PackageName, "Execution Nr",  status, Result, ExecutionInstanceGUID
                   | eval AlertLevel = case(Result=="OK",1,Result=="WARNING",2,Result=="KO",3)
                   | rangemap field=AlertLevel low=1-1 elevated=2-2 severe=3-3 default=guarded
                   | fields - AlertLevel
|appendpipe [stats count by ExecutionInstanceGUID | join type=outer ExecutionInstanceGUID [
                   search index=MYINDEX source=MYSOURCE
                   | rename ExplodedPackages{}.Error AS Error, ExplodedPackages{}.Package AS Package, ExplodedPackages{}.TimeStamp AS TimeStamp
                   | eval x=mvzip(TimeStamp,mvzip(Package,Error))
                   | mvexpand x
                   | eval y=mvzip(ExecutionInstanceGUID,x)
                   | mvexpand y
                   | eval z=split(y,",") 
                   | eval ExecutionInstanceGUID=mvindex(z,0)
                   | eval TimeStamp=mvindex(z,1)
                   | eval Package=mvindex(z,2)
                   | eval Error=mvindex(z,3)
                   | table ExecutionInstanceGUID,TimeStamp, Package, Error]]
|stats values(PackageName) as PackageName by ExecutionInstanceGUID Error
0 Karma

andreafebbo
Communicator

THe result of this query is the following:

1..A
2..B
1.......bla1
1.......bla2

and not:

1..A..bla1
1..A..bla2
2..B........

0 Karma

cmerriman
Super Champion

try my updated syntax. I used a stats command at the bottom to bring back the values of the PackageName by the ID and Error, so it should join the 1s and bla1/bla2s.

0 Karma

andreafebbo
Communicator

Nothing: now it shows just 3 columns:
ExecutionInstanceGUID
Error
PackageName (empty column).

In the example, what i call the column name(A, B) is one column but in the reality are many columns(as you can see in the query). I need all of them

0 Karma

cmerriman
Super Champion

you might need to play with the stats command at the bottom, join by any of the column names that might be in common, or do a latest instead of values.

|stats values(*) as * by ExecutionInstanceGUID 
0 Karma

cmerriman
Super Champion

can you provide syntax with how you get current tables? You might be able to use appendpipe.

0 Karma

davebrooking
Contributor

I understand your requirement is more involved, but given the basic datasets from above the following search produces similar results to that shown:

index=MYINDEX source=MYSOURCE | stats values(error) as error values(name) as name  by id | eval error=if(isnull(error),"NULL",error)  | mvexpand error | table id name error

Dave

0 Karma

andreafebbo
Communicator
index=MYINDEX source=MYSOURCE 
                  | eval Day = strftime(_time,"%F")
                  | eval DateTime = strftime(_time,"%Y-%m-%d %H:%M:%S")
                  | stats Latest(LogType) as status Latest(Result) as Result Latest(DateTime) as When by PackageName  ,ExecutionInstanceGUID, Day
                  | sort When
                  | streamstats  count as "Execution Nr" by PackageName, Day
                  | sort - When
                  | table When, PackageName, "Execution Nr",  status, Result, ExecutionInstanceGUID
                  | eval AlertLevel = case(Result=="OK",1,Result=="WARNING",2,Result=="KO",3)
                  | rangemap field=AlertLevel low=1-1 elevated=2-2 severe=3-3 default=guarded
                  | fields - AlertLevel
                  | join type=outer ExecutionInstanceGUID[
                         search index=MYINDEX source=MYSOURCE
                         | rename ExplodedPackages{}.Error AS Error, ExplodedPackages{}.Package AS Package, ExplodedPackages{}.TimeStamp AS TimeStamp
                         | eval x=mvzip(TimeStamp,mvzip(Package,Error))
                         | mvexpand x
                         | eval y=mvzip(ExecutionInstanceGUID,x)
                         | mvexpand y
                         | eval z=split(y,",") 
                         | eval ExecutionInstanceGUID=mvindex(z,0)
                         | eval TimeStamp=mvindex(z,1)
                         | eval Package=mvindex(z,2)
                         | eval Error=mvindex(z,3)
                         | table ExecutionInstanceGUID,TimeStamp, Package, Error
                  ]

With this syntax I get just one error for each row in the first table

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...