Splunk Search

Use Join but also display non matching datasets

davidb89
Engager

I'm currenty trying to combine data from our firewall and sysmon which is running on a testclient. I want to join the Commandline and the PID of the causing process to the firewall information. That works pretty well but I can then only see the datasets which were sucessfully joined. I want to extend the datasets that have been joined but I also want to keep the ones that couldn't be joined. This is my search:

sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block | join source_address, source_port, destinatin_port, destination_address [search sourcetype=nxlog_sysmon | rename SourceIp AS source_address, SourcePort AS source_port, DestinationPort AS destination_port, DestinationIp AS destination_address] | table source_address, destination_address, destination_port, Image

How can I extend the datasets that match the join but stell display all the unmatched ones?

Tags (2)
0 Karma
1 Solution

somesoni2
Revered Legend

With join (which is very expensive BTW), you get option to do left join (keep everything from LHS of join) and can be done like this (modified the join subsearch to include aggregation command)

sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block | join type=left source_address, source_port, destinatin_port, destination_address [search sourcetype=nxlog_sysmon  | stats count by SourceIp SourcePort DestinationPort DestinationoIp | rename SourceIp AS source_address, SourcePort AS source_port, DestinationPort AS destination_port, DestinationIp AS destination_address] | table source_address, destination_address, destination_port, Image

But there may be a better alternative using stats. See if this give your results that you need (guessing it would work as you just doesn't seem to bother about time)

( sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block) OR (sourcetype=nxlog_sysmon)
| eval source_address=coalesce(source_address,SourceIp) | eval source_port=coalesce(source_port,SourcePort)
| eval destination_address=coalesce(destination_address,DestinationIp) | eval destination_port=coalesce(destination_port,DestinationPort)
| stats values(Image) as Image by source_address, destination_address, destination_port

View solution in original post

somesoni2
Revered Legend

With join (which is very expensive BTW), you get option to do left join (keep everything from LHS of join) and can be done like this (modified the join subsearch to include aggregation command)

sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block | join type=left source_address, source_port, destinatin_port, destination_address [search sourcetype=nxlog_sysmon  | stats count by SourceIp SourcePort DestinationPort DestinationoIp | rename SourceIp AS source_address, SourcePort AS source_port, DestinationPort AS destination_port, DestinationIp AS destination_address] | table source_address, destination_address, destination_port, Image

But there may be a better alternative using stats. See if this give your results that you need (guessing it would work as you just doesn't seem to bother about time)

( sourcetype="pfsensefirewall" source_address=10.0.2.10 firewall_action=block) OR (sourcetype=nxlog_sysmon)
| eval source_address=coalesce(source_address,SourceIp) | eval source_port=coalesce(source_port,SourcePort)
| eval destination_address=coalesce(destination_address,DestinationIp) | eval destination_port=coalesce(destination_port,DestinationPort)
| stats values(Image) as Image by source_address, destination_address, destination_port

davidb89
Engager

Thanks for your answer! It does work well and as you guessed correctly time is not a factor in this case. But out of curiosity: Is there any way to get the same result in a more efficient way?

0 Karma

davidb89
Engager

The more effective solution comes with some serious problems. We don't currently have the exact same basis of data in both sourcetypes. When I'm using the stats search the datasets are merged eventhough there often aren't two matchind entries. I only want them to show up if there are two matching entries. I looked at all the documentation I could find about the coalesce method but it does not seem to be possible, is it?

0 Karma

MuS
Legend

the stats search created by @somesoni2 is your efficient way to do it 😉

cheers, MuS

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...